From lkcd-general-owner@lists.sourceforge.net Thu Nov 01 11:25:39 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 15zNSv-0000vE-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 01 Nov 2001 11:25:25 -0800
Received: from nakedeye.aparity.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fA1JPO019276
	for <lkcd@oss.sgi.com>; Thu, 1 Nov 2001 11:25:24 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fA1JTh702751
	for <lkcd@oss.sgi.com>; Thu, 1 Nov 2001 11:29:44 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: <lkcd@oss.sgi.com>
Message-ID: <Pine.LNX.4.30.0111011129210.2726-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: [lkcd-general] Test - please ignore
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov  1 11:26:17 2001
X-Original-Date: Thu, 1 Nov 2001 11:29:43 -0800 (PST)

This is a test message.  Nothing to see here, move along. :)

--Matt



From lkcd-general-owner@lists.sourceforge.net Mon Nov 05 23:26:14 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 1610cd-0000F2-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 05 Nov 2001 23:26:11 -0800
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fA61o1000381
	for <lkcd@oss.sgi.com>; Mon, 5 Nov 2001 17:50:01 -0800
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.99.140.23])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id UAA46742
	for <lkcd@oss.sgi.com>; Mon, 5 Nov 2001 20:47:24 -0500
Received: from d03nm038.boulder.ibm.com (d03nm038.boulder.ibm.com [9.99.140.38])
	by westrelay02.boulder.ibm.com (8.11.1m3/NCO v4.98) with ESMTP id fA61nxs186512
	for <lkcd@oss.sgi.com>; Mon, 5 Nov 2001 18:49:59 -0700
Importance: Normal
To: lkcd@oss.sgi.com
X-Mailer: Lotus Notes Release 5.0.4  June 8, 2000
Message-ID: <OFF2E00167.1A322A44-ON88256AFC.00097E17@boulder.ibm.com>
From: "James Washer" <washer@us.ibm.com>
X-MIMETrack: Serialize by Router on D03NM038/03/M/IBM(Release 5.0.8 |June 18, 2001) at
 11/05/2001 06:49:59 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Subject: [lkcd-general] lkcd license?
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Nov  5 23:27:02 2001
X-Original-Date: Mon, 5 Nov 2001 17:44:40 -0800

I've looked through the source, and the web page.. and I've not been able
to determine what license "lkcd" is released under..

Does anyone know?

 - jim



From lkcd-general-owner@lists.sourceforge.net Tue Nov 06 08:02:17 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 1618g2-00017C-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 06 Nov 2001 08:02:14 -0800
Received: from d12lmsgate-3.de.ibm.com (d12lmsgate-3.de.ibm.com [195.212.91.201])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fA6G2C024114
	for <lkcd@oss.sgi.com>; Tue, 6 Nov 2001 08:02:13 -0800
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-3.de.ibm.com (1.0.0) with ESMTP id RAA26776
	for <lkcd@oss.sgi.com>; Tue, 6 Nov 2001 17:02:05 +0100
Received: from d12ml033.de.ibm.com (d12ml033_cs0 [9.165.223.11])
	by d12relay02.de.ibm.com (8.11.1m3/NCO v4.98) with ESMTP id fA6G25J23538
	for <lkcd@oss.sgi.com>; Tue, 6 Nov 2001 17:02:05 +0100
Subject: Re: [lkcd-general] lkcd license?
To: "James Washer" <washer@us.ibm.com>, lkcd@oss.sgi.com
X-Mailer: Lotus Notes Release 5.0.4a  July 24, 2000
Message-ID: <OFEE075679.FB22514B-ON41256AFC.00524135@de.ibm.com>
From: "Andreas Herrmann" <AHERRMAN@de.ibm.com>
X-MIMETrack: Serialize by Router on D12ML033/12/M/IBM(Release 5.0.8 |June 18, 2001) at
 06/11/2001 17:02:04
MIME-Version: 1.0
Content-type: text/plain; charset=iso-8859-1
Content-transfer-encoding: quoted-printable
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov  6 08:03:01 2001
X-Original-Date: Tue, 6 Nov 2001 17:02:03 +0100

Hi,

Before lkcd moved completely from oss.sgi.com to sourceforge.net you co=
uld
find license information on oss.sgi.com/projects/lkcd -
like it is done for kgdb (see
http://oss.sgi.com/projects/kgdb/license.html).
There it was stated that lkcd is released under GPL.

Currently on sourceforge you find license information at web page
http://sourceforge.net/projects/lkcd.
But I really don't =19know which parts of lkcd may be released under LG=
PL
(?).

I think we have to introduce README files into the directory structure =
on
sourceforge containing license information for lkcd.


Regards,

Andreas

--
Linux for eServer Development
Tel :  +49-7031-16-4640
Notes mail :  Andreas Herrmann/GERMANY/IBM@IBMDE
email :  aherrman@de.ibm.com



|--------+---------------------------------------->
|        |          James                         |
|        |          Washer/Beaverton/IBM@IBMUS    |
|        |          Sent by:                      |
|        |          lkcd-general-admin@lists.sourc|
|        |          eforge.net                    |
|        |                                        |
|        |                                        |
|        |          11/06/01 02:44 AM             |
|        |          Please respond to James Washer|
|        |                                        |
|--------+---------------------------------------->
  >--------------------------------------------------------------------=
---------------------------------|
  |                                                                    =
                                 |
  |      To:     lkcd@oss.sgi.com                                      =
                                 |
  |      cc:                                                           =
                                 |
  |      Subject:     [lkcd-general] lkcd license?                     =
                                 |
  |                                                                    =
                                 |
  |                                                                    =
                                 |
  >--------------------------------------------------------------------=
---------------------------------|



I've looked through the source, and the web page.. and I've not been ab=
le
to determine what license "lkcd" is released under..

Does anyone know?

 - jim


_______________________________________________
Lkcd-general mailing list
Lkcd-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lkcd-general


=




From lkcd-general-owner@lists.sourceforge.net Tue Nov 06 11:29:33 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 161Buf-0005Ea-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 06 Nov 2001 11:29:33 -0800
Received: from fcexgw03.efi.com ([192.68.228.82])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fA6JTW008154
	for <lkcd@oss.sgi.com>; Tue, 6 Nov 2001 11:29:32 -0800
Received: from 10.10.12.116 by fcexgw03.efi.com (InterScan E-Mail VirusWall NT); Tue, 06 Nov 2001 11:29:17 -0800
Received: by ex-imc2-corp.efi.com with Internet Mail Service (5.5.2653.19)
	id <WKZLSGWZ>; Tue, 6 Nov 2001 11:29:23 -0800
Message-ID: <D9F6B9DABA4CAE4B92850252C52383AB02AA21A9@ex-eng-corp>
From: Kallol Biswas <Kallol.Biswas@efi.com>
To: "'lkcd@oss.sgi.com'" <lkcd@oss.sgi.com>
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: multipart/mixed;
	boundary="------------InterScan_NT_MIME_Boundary"
Subject: [lkcd-general] crash dump on 2.4 kernel
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov  6 11:30:07 2001
X-Original-Date: Tue, 6 Nov 2001 11:29:31 -0800

This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.

--------------InterScan_NT_MIME_Boundary
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C166F9.5BA38EE0"

------_=_NextPart_001_01C166F9.5BA38EE0
Content-Type: text/plain;
	charset="iso-8859-1"

Hi,
   I am trying to get a crash dump on a 2.4 kernel, the system has an IDE
disk?
Does crash dump work on an IDE disk? Also is there an archive for the
mailing
list lkcd@oss.sgi.com? The lkcdutils binary and source files are not working
for me.
Is there a tar archive for these packages?

Regards,
Kallol Biswas

------_=_NextPart_001_01C166F9.5BA38EE0
Content-Type: text/html;
	charset="iso-8859-1"

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 5.5.2654.45">
<TITLE>crash dump on 2.4 kernel</TITLE>
</HEAD>
<BODY>

<P><FONT SIZE=2 FACE="Arial">Hi,</FONT>
<BR><FONT SIZE=2 FACE="Arial">&nbsp;&nbsp; I am trying to get a crash dump on a 2.4 kernel, the system has an IDE disk?</FONT>
<BR><FONT SIZE=2 FACE="Arial">Does crash dump work on an IDE disk? Also is there an archive for the mailing</FONT>
<BR><FONT SIZE=2 FACE="Arial">list lkcd@oss.sgi.com? The lkcdutils binary and source files are not working for me.</FONT>
<BR><FONT SIZE=2 FACE="Arial">Is there a tar archive for these packages?</FONT>
</P>

<P><FONT SIZE=2 FACE="Arial">Regards,</FONT>
<BR><FONT SIZE=2 FACE="Arial">Kallol Biswas</FONT>
</P>

</BODY>
</HTML>
------_=_NextPart_001_01C166F9.5BA38EE0--

--------------InterScan_NT_MIME_Boundary--



From lkcd-general-owner@lists.sourceforge.net Tue Nov 06 12:50:54 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 161DBO-0003dO-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 06 Nov 2001 12:50:54 -0800
Received: from smtp.alacritech.com (smtp.alacritech.com [209.10.208.82])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fA6Kor010045
	for <lkcd@oss.sgi.com>; Tue, 6 Nov 2001 12:50:53 -0800
Received: from alacritech.com (lambda.alacritech.com [10.1.1.32])
	by smtp.alacritech.com (8.11.2/8.11.2) with ESMTP id fA6KnrK28292;
	Tue, 6 Nov 2001 12:49:53 -0800
Message-ID: <3BE84E94.AD1F9FDC@alacritech.com>
From: "Matt D. Robinson" <yakker@alacritech.com>
Organization: Alacritech, Inc.
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.4.2-2smp i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Andreas Herrmann <AHERRMAN@de.ibm.com>
CC: James Washer <washer@us.ibm.com>, lkcd@oss.sgi.com
Subject: Re: [lkcd-general] lkcd license?
References: <OFEE075679.FB22514B-ON41256AFC.00524135@de.ibm.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov  6 12:51:07 2001
X-Original-Date: Tue, 06 Nov 2001 12:56:52 -0800

There's a short and long answer here.

Short answer: Everything is under GPL except libraries for lcrash, which
              are LGPL.  This allows someone to build their own programs
              against the libraries if they need to without worrying about
              GPL violations.

<slight rant>
Long answer:  While the short answer sufficies for most people, there are
              people looking to expand LKCD to work for proprietary
              objectives.  I think it's a GOOD thing to allow for open
              source AND closed source to work together simultaneously.

              If we need to create a mechanism by which someone can
              include their own proprietary objects, modules, etc., then
              let's do it.  I am sick and tired of this "use open source or
              go away" crap -- you can do both and still make a great product
              that works for everyone, that is seamless, clean, interoperable,
              maintainable, etc., etc., etc.  Most people with binary
              modules or applications for Linux have problems because the
              open source community is so enchanted with their own idealism
              that they forget their reasons for writing code in the
              first place.  I do this for the love of RAS, not for the need
              to make a political or ideological statement.  Hopefully some
              system running a cancer research program, or a clean-fuel-cell
              simulation, or anything like that, when it crashes, LKCD helps
              them get back up and running more quickly.  That's what it's
              all about.
</slight rant>

--Matt

Andreas Herrmann wrote:
> 
> Hi,
> 
> Before lkcd moved completely from oss.sgi.com to sourceforge.net you could
> find license information on oss.sgi.com/projects/lkcd -
> like it is done for kgdb (see
> http://oss.sgi.com/projects/kgdb/license.html).
> There it was stated that lkcd is released under GPL.
> 
> Currently on sourceforge you find license information at web page
> http://sourceforge.net/projects/lkcd.
> But I really don't know which parts of lkcd may be released under LGPL
> (?).
> 
> I think we have to introduce README files into the directory structure on
> sourceforge containing license information for lkcd.
> 
> Regards,
> 
> Andreas
> 
> --
> Linux for eServer Development
> Tel :  +49-7031-16-4640
> Notes mail :  Andreas Herrmann/GERMANY/IBM@IBMDE
> email :  aherrman@de.ibm.com
> 
> |--------+---------------------------------------->
> |        |          James                         |
> |        |          Washer/Beaverton/IBM@IBMUS    |
> |        |          Sent by:                      |
> |        |          lkcd-general-admin@lists.sourc|
> |        |          eforge.net                    |
> |        |                                        |
> |        |                                        |
> |        |          11/06/01 02:44 AM             |
> |        |          Please respond to James Washer|
> |        |                                        |
> |--------+---------------------------------------->
>   >-----------------------------------------------------------------------------------------------------|
>   |                                                                                                     |
>   |      To:     lkcd@oss.sgi.com                                                                       |
>   |      cc:                                                                                            |
>   |      Subject:     [lkcd-general] lkcd license?                                                      |
>   |                                                                                                     |
>   |                                                                                                     |
>   >-----------------------------------------------------------------------------------------------------|
> 
> I've looked through the source, and the web page.. and I've not been able
> to determine what license "lkcd" is released under..
> 
> Does anyone know?
> 
>  - jim
> 
> _______________________________________________
> Lkcd-general mailing list
> Lkcd-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/lkcd-general
> 
> _______________________________________________
> Lkcd-general mailing list
> Lkcd-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/lkcd-general


From lkcd-general-owner@lists.sourceforge.net Tue Nov 06 12:51:36 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 161DC3-0003k6-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 06 Nov 2001 12:51:35 -0800
Received: from smtp.alacritech.com (smtp.alacritech.com [209.10.208.82])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fA6KpZ010055
	for <lkcd@oss.sgi.com>; Tue, 6 Nov 2001 12:51:35 -0800
Received: from alacritech.com (lambda.alacritech.com [10.1.1.32])
	by smtp.alacritech.com (8.11.2/8.11.2) with ESMTP id fA6KodK28310;
	Tue, 6 Nov 2001 12:50:39 -0800
Message-ID: <3BE84EC3.8C2C9750@alacritech.com>
From: "Matt D. Robinson" <yakker@alacritech.com>
Organization: Alacritech, Inc.
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.4.2-2smp i686)
X-Accept-Language: en
MIME-Version: 1.0
To: Kallol Biswas <Kallol.Biswas@efi.com>
CC: "'lkcd@oss.sgi.com'" <lkcd@oss.sgi.com>
Subject: Re: [lkcd-general] crash dump on 2.4 kernel
References: <D9F6B9DABA4CAE4B92850252C52383AB02AA21A9@ex-eng-corp>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov  6 12:52:04 2001
X-Original-Date: Tue, 06 Nov 2001 12:57:39 -0800

> Kallol Biswas wrote:
> 
> Hi,
>    I am trying to get a crash dump on a 2.4 kernel, the system has an IDE
> disk?
> Does crash dump work on an IDE disk? Also is there an archive for the
> mailing
> list lkcd@oss.sgi.com? The lkcdutils binary and source files are not working
> for me.
> Is there a tar archive for these packages?
> 
> Regards,
> Kallol Biswas

Crash dumps work on IDE disks with the 4.0 code.  Are you having some 
problem with the 4.0 RPMs for a reason?  Is it because of the version
of your RPM program, or ... ?

--Matt


From lkcd-general-owner@lists.sourceforge.net Tue Nov 06 14:41:20 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 161EuG-0002ND-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 06 Nov 2001 14:41:20 -0800
Received: from sgi.com (sgi.SGI.COM [192.48.153.1])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fA6MfJ015684
	for <lkcd@oss.sgi.com>; Tue, 6 Nov 2001 14:41:19 -0800
Received: from fcexgw03.efi.com ([192.68.228.82]) 
	by sgi.com (980327.SGI.8.8.8-aspam/980304.SGI-aspam:
       SGI does not authorize the use of its proprietary
       systems or networks for unsolicited or bulk email
       from the Internet.) 
	via SMTP id OAA08142
	for <lkcd@oss.sgi.com>; Tue, 6 Nov 2001 14:41:19 -0800 (PST)
	mail_from (Kallol.Biswas@efi.com)
Received: from 10.10.12.104 by fcexgw03.efi.com (InterScan E-Mail VirusWall NT); Tue, 06 Nov 2001 14:38:50 -0800
Received: by EX-IMC3-CORP.efi.com with Internet Mail Service (5.5.2653.19)
	id <WLCNAFZW>; Tue, 6 Nov 2001 14:39:09 -0800
Message-ID: <D9F6B9DABA4CAE4B92850252C52383AB02AA21AD@ex-eng-corp>
From: Kallol Biswas <Kallol.Biswas@efi.com>
To: "'Matt D. Robinson'" <yakker@alacritech.com>,
   Kallol Biswas
	 <Kallol.Biswas@efi.com>
Cc: "'lkcd@oss.sgi.com'" <lkcd@oss.sgi.com>
Subject: RE: [lkcd-general] crash dump on 2.4 kernel
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: multipart/mixed;
	boundary="------------InterScan_NT_MIME_Boundary"
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov  6 14:42:02 2001
X-Original-Date: Tue, 6 Nov 2001 14:39:03 -0800

This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.

--------------InterScan_NT_MIME_Boundary
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C16713.D5D39F60"

------_=_NextPart_001_01C16713.D5D39F60
Content-Type: text/plain;
	charset="iso-8859-1"


# rpm --version
RPM version 4.0.2

My system is running debian distribution of 2.4.7 kernel.


# rpm -ivh  lkcdutils-4.0-1.i386.rpm
error: cannot open Packages index using db3 - No such file or directory (2)
error: cannot open Packages database in /var/lib/rpm

Also when the source code is compiled a few " shift/reduce & reduce/reduce
conflict"
messages  from yacc  are generated.
The out put is attached below:

cc -D_FILE_OFFSET_BITS=64 -gstabs -Wall -DARCH=i386 -I/usr/src/linux/include
gcc 
-D_FILE_OFFSET_BITS=64 -gstabs -Wall -DARCH=i386 -I/usr/src/linux/include
yacc -psialpp -v -t -d sialpp.y
conflicts:  23 shift/reduce
cat y.tab.c | sed -f sialpp-lsed > sialpp.tab.c
cat y.tab.h | sed -f sialpp-lsed > sialpp.tab.h
gcc -gstabs -c sialpp.tab.c
sialpp.y: In function `sial_getppnode':
sialpp.y:83: `yyval' undeclared (first use in this function)
sialpp.y:83: (Each undeclared identifier is reported only once
sialpp.y:83: for each function it appears in.)
make[1]: *** [sialpp.tab.o] Error 1
make[1]: Leaving directory `/usr/src/lkcdutils-4.0/libsial'
make[1]: Entering directory `/usr/src/lkcdutils-4.0/lcrash'
/bin/rm -f ./include/arch
(cd include ; /bin/ln -sf arch-i386 arch)
(cd ./../libklib ; make ARCH=i386 symlinks)
make[2]: Entering directory `/usr/src/lkcdutils-4.0/libklib'









"Matt D. Robinson" wrote:

  > Kallol Biswas wrote:
  >
  > Hi,
  >    I am trying to get a crash dump on a 2.4 kernel, the system has an
IDE
  > disk?
  > Does crash dump work on an IDE disk? Also is there an archive for the
  > mailing
  > list lkcd@oss.sgi.com? The lkcdutils binary and source files are not
working
  > for me.
  > Is there a tar archive for these packages?
  >
  > Regards,
  > Kallol Biswas

  Crash dumps work on IDE disks with the 4.0 code.  Are you having some
  problem with the 4.0 RPMs for a reason?  Is it because of the version
  of your RPM program, or ... ?

  --Matt

-----Original Message-----
From: Matt D. Robinson [mailto:yakker@alacritech.com]
Sent: Tuesday, November 06, 2001 12:58 PM
To: Kallol Biswas
Cc: 'lkcd@oss.sgi.com'
Subject: Re: [lkcd-general] crash dump on 2.4 kernel


> Kallol Biswas wrote:
> 
> Hi,
>    I am trying to get a crash dump on a 2.4 kernel, the system has an IDE
> disk?
> Does crash dump work on an IDE disk? Also is there an archive for the
> mailing
> list lkcd@oss.sgi.com? The lkcdutils binary and source files are not
working
> for me.
> Is there a tar archive for these packages?
> 
> Regards,
> Kallol Biswas

Crash dumps work on IDE disks with the 4.0 code.  Are you having some 
problem with the 4.0 RPMs for a reason?  Is it because of the version
of your RPM program, or ... ?

--Matt

------_=_NextPart_001_01C16713.D5D39F60
Content-Type: text/html;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV=3D"Content-Type" CONTENT=3D"text/html; =
charset=3Diso-8859-1">
<META NAME=3D"Generator" CONTENT=3D"MS Exchange Server version =
5.5.2654.45">
<TITLE>RE: [lkcd-general] crash dump on 2.4 kernel</TITLE>
</HEAD>
<BODY>
<BR>

<P><FONT SIZE=3D2># rpm --version</FONT>
<BR><FONT SIZE=3D2>RPM version 4.0.2</FONT>
</P>

<P><FONT SIZE=3D2>My system is running debian distribution of 2.4.7 =
kernel.</FONT>
</P>
<BR>

<P><FONT SIZE=3D2># rpm -ivh&nbsp; lkcdutils-4.0-1.i386.rpm</FONT>
<BR><FONT SIZE=3D2>error: cannot open Packages index using db3 - No =
such file or directory (2)</FONT>
<BR><FONT SIZE=3D2>error: cannot open Packages database in =
/var/lib/rpm</FONT>
</P>

<P><FONT SIZE=3D2>Also when the source code is compiled a few &quot; =
shift/reduce &amp; reduce/reduce conflict&quot;</FONT>
<BR><FONT SIZE=3D2>messages&nbsp; from yacc&nbsp; are generated.</FONT>
<BR><FONT SIZE=3D2>The out put is attached below:</FONT>
</P>

<P><FONT SIZE=3D2>cc -D_FILE_OFFSET_BITS=3D64 -gstabs -Wall =
-DARCH=3Di386 -I/usr/src/linux/include&nbsp;&nbsp; gcc </FONT>
<BR><FONT SIZE=3D2>-D_FILE_OFFSET_BITS=3D64 -gstabs -Wall -DARCH=3Di386 =
-I/usr/src/linux/include&nbsp;&nbsp; yacc -psialpp -v -t -d =
sialpp.y</FONT>
<BR><FONT SIZE=3D2>conflicts:&nbsp; 23 shift/reduce</FONT>
<BR><FONT SIZE=3D2>cat y.tab.c | sed -f sialpp-lsed &gt; =
sialpp.tab.c</FONT>
<BR><FONT SIZE=3D2>cat y.tab.h | sed -f sialpp-lsed &gt; =
sialpp.tab.h</FONT>
<BR><FONT SIZE=3D2>gcc -gstabs -c sialpp.tab.c</FONT>
<BR><FONT SIZE=3D2>sialpp.y: In function `sial_getppnode':</FONT>
<BR><FONT SIZE=3D2>sialpp.y:83: `yyval' undeclared (first use in this =
function)</FONT>
<BR><FONT SIZE=3D2>sialpp.y:83: (Each undeclared identifier is reported =
only once</FONT>
<BR><FONT SIZE=3D2>sialpp.y:83: for each function it appears =
in.)</FONT>
<BR><FONT SIZE=3D2>make[1]: *** [sialpp.tab.o] Error 1</FONT>
<BR><FONT SIZE=3D2>make[1]: Leaving directory =
`/usr/src/lkcdutils-4.0/libsial'</FONT>
<BR><FONT SIZE=3D2>make[1]: Entering directory =
`/usr/src/lkcdutils-4.0/lcrash'</FONT>
<BR><FONT SIZE=3D2>/bin/rm -f ./include/arch</FONT>
<BR><FONT SIZE=3D2>(cd include ; /bin/ln -sf arch-i386 arch)</FONT>
<BR><FONT SIZE=3D2>(cd ./../libklib ; make ARCH=3Di386 symlinks)</FONT>
<BR><FONT SIZE=3D2>make[2]: Entering directory =
`/usr/src/lkcdutils-4.0/libklib'</FONT>
</P>
<BR>
<BR>
<BR>
<BR>
<BR>
<BR>
<BR>
<BR>

<P><FONT SIZE=3D2>&quot;Matt D. Robinson&quot; wrote:</FONT>
</P>

<P><FONT SIZE=3D2>&nbsp; &gt; Kallol Biswas wrote:</FONT>
<BR><FONT SIZE=3D2>&nbsp; &gt;</FONT>
<BR><FONT SIZE=3D2>&nbsp; &gt; Hi,</FONT>
<BR><FONT SIZE=3D2>&nbsp; &gt;&nbsp;&nbsp;&nbsp; I am trying to get a =
crash dump on a 2.4 kernel, the system has an IDE</FONT>
<BR><FONT SIZE=3D2>&nbsp; &gt; disk?</FONT>
<BR><FONT SIZE=3D2>&nbsp; &gt; Does crash dump work on an IDE disk? =
Also is there an archive for the</FONT>
<BR><FONT SIZE=3D2>&nbsp; &gt; mailing</FONT>
<BR><FONT SIZE=3D2>&nbsp; &gt; list lkcd@oss.sgi.com? The lkcdutils =
binary and source files are not working</FONT>
<BR><FONT SIZE=3D2>&nbsp; &gt; for me.</FONT>
<BR><FONT SIZE=3D2>&nbsp; &gt; Is there a tar archive for these =
packages?</FONT>
<BR><FONT SIZE=3D2>&nbsp; &gt;</FONT>
<BR><FONT SIZE=3D2>&nbsp; &gt; Regards,</FONT>
<BR><FONT SIZE=3D2>&nbsp; &gt; Kallol Biswas</FONT>
</P>

<P><FONT SIZE=3D2>&nbsp; Crash dumps work on IDE disks with the 4.0 =
code.&nbsp; Are you having some</FONT>
<BR><FONT SIZE=3D2>&nbsp; problem with the 4.0 RPMs for a reason?&nbsp; =
Is it because of the version</FONT>
<BR><FONT SIZE=3D2>&nbsp; of your RPM program, or ... ?</FONT>
</P>

<P><FONT SIZE=3D2>&nbsp; --Matt</FONT>
</P>

<P><FONT SIZE=3D2>-----Original Message-----</FONT>
<BR><FONT SIZE=3D2>From: Matt D. Robinson [<A =
HREF=3D"mailto:yakker@alacritech.com">mailto:yakker@alacritech.com</A>]<=
/FONT>
<BR><FONT SIZE=3D2>Sent: Tuesday, November 06, 2001 12:58 PM</FONT>
<BR><FONT SIZE=3D2>To: Kallol Biswas</FONT>
<BR><FONT SIZE=3D2>Cc: 'lkcd@oss.sgi.com'</FONT>
<BR><FONT SIZE=3D2>Subject: Re: [lkcd-general] crash dump on 2.4 =
kernel</FONT>
</P>
<BR>

<P><FONT SIZE=3D2>&gt; Kallol Biswas wrote:</FONT>
<BR><FONT SIZE=3D2>&gt; </FONT>
<BR><FONT SIZE=3D2>&gt; Hi,</FONT>
<BR><FONT SIZE=3D2>&gt;&nbsp;&nbsp;&nbsp; I am trying to get a crash =
dump on a 2.4 kernel, the system has an IDE</FONT>
<BR><FONT SIZE=3D2>&gt; disk?</FONT>
<BR><FONT SIZE=3D2>&gt; Does crash dump work on an IDE disk? Also is =
there an archive for the</FONT>
<BR><FONT SIZE=3D2>&gt; mailing</FONT>
<BR><FONT SIZE=3D2>&gt; list lkcd@oss.sgi.com? The lkcdutils binary and =
source files are not working</FONT>
<BR><FONT SIZE=3D2>&gt; for me.</FONT>
<BR><FONT SIZE=3D2>&gt; Is there a tar archive for these =
packages?</FONT>
<BR><FONT SIZE=3D2>&gt; </FONT>
<BR><FONT SIZE=3D2>&gt; Regards,</FONT>
<BR><FONT SIZE=3D2>&gt; Kallol Biswas</FONT>
</P>

<P><FONT SIZE=3D2>Crash dumps work on IDE disks with the 4.0 =
code.&nbsp; Are you having some </FONT>
<BR><FONT SIZE=3D2>problem with the 4.0 RPMs for a reason?&nbsp; Is it =
because of the version</FONT>
<BR><FONT SIZE=3D2>of your RPM program, or ... ?</FONT>
</P>

<P><FONT SIZE=3D2>--Matt</FONT>
</P>

</BODY>
</HTML>
------_=_NextPart_001_01C16713.D5D39F60--

--------------InterScan_NT_MIME_Boundary--



From nava@core.rose.hp.com Wed Nov 07 18:51:01 2001
Received: from palrel1.hp.com ([156.153.255.242])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 161fEW-0006Au-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 07 Nov 2001 18:48:00 -0800
Received: from core.rose.hp.com (core.rose.hp.com [15.43.208.100])
	by palrel1.hp.com (Postfix) with ESMTP id 5A625E8A
	for <lkcd-general@lists.sourceforge.net>; Wed,  7 Nov 2001 18:44:52 -0800 (PST)
Received: (from nava@localhost) by core.rose.hp.com (8.9.3 (PHNE_22672)/8.8.6 SMKit7.02) id SAA12523 for lkcd-general@lists.sourceforge.net; Wed, 7 Nov 2001 18:46:19 -0800 (PST)
From: Nava Navaruparajah <nava@core.rose.hp.com>
Message-Id: <200111080246.SAA12523@core.rose.hp.com>
To: lkcd-general@lists.sourceforge.net
X-Mailer: Elm [revision: 212.5]
Subject: [lkcd-general] FAQ is missing
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov  7 18:51:03 2001
X-Original-Date: Wed, 07 Nov 2001 18:46:19 PST

Hi,

  I want to try the latest lkcd - 4.0. However, the FAQ is
missing in http://lkcd.sourceforge.net/faq.html

  Can someone point me the right FAQ for getting started with
the latest lkcd.

thanks,
nava


From nava@core.rose.hp.com Fri Nov 09 14:38:03 2001
Received: from palrel10.hp.com ([156.153.255.245])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 162KEp-0004et-00
	for <lkcd-general@lists.sourceforge.net>; Fri, 09 Nov 2001 14:35:03 -0800
Received: from core.rose.hp.com (core.rose.hp.com [15.43.208.100])
	by palrel10.hp.com (Postfix) with ESMTP id E435F1F641
	for <lkcd-general@lists.sourceforge.net>; Fri,  9 Nov 2001 14:31:38 -0800 (PST)
Received: (from nava@localhost) by core.rose.hp.com (8.9.3 (PHNE_22672)/8.8.6 SMKit7.02) id OAA16306; Fri, 9 Nov 2001 14:33:04 -0800 (PST)
From: Nava Navaruparajah <nava@core.rose.hp.com>
Message-Id: <200111092233.OAA16306@core.rose.hp.com>
To: lkcd-general@lists.sourceforge.net
Cc: nava@core.rose.hp.com
X-Mailer: Elm [revision: 212.5]
Subject: [lkcd-general] LKCD 4.0 not working for me
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Fri Nov  9 14:39:01 2001
X-Original-Date: Fri, 09 Nov 2001 14:33:03 PST

Hi,

   I tried latest LKCD 4.0 patched onto 2.4.8 kernel. However the
system hangs while dump being generated. After system crash,
it says "writing Header pages .....". Then hangs while doing
"writing pages..... "..  

My setup:
dual pentium processore system. On top of RedHat 7.1, I have
linux 2.4.8. 

Am I missing anything?

Thanks,
nava


From lkcd-general-owner@lists.sourceforge.net Mon Nov 12 21:29:14 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 163W7n-0007yX-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 12 Nov 2001 21:28:43 -0800
Received: from ureach.com (IDENT:root@mail.ureach.com [63.150.151.36])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAD5Sg027397
	for <lkcd@oss.sgi.com>; Mon, 12 Nov 2001 21:28:43 -0800
Received: from www20.ureach.com (IDENT:root@www20.ureach.com [172.16.2.48])
	by ureach.com (8.9.1/8.8.5) with ESMTP id AAA21191
	for <lkcd@oss.sgi.com>; Tue, 13 Nov 2001 00:28:38 -0500
Received: (from nobody@localhost)
	by www20.ureach.com (8.9.3/8.9.1) id AAA24441;
	Tue, 13 Nov 2001 00:28:39 -0500
Message-Id: <200111130528.AAA24441@www20.ureach.com>
To: lkcd@oss.sgi.com
From: Kapish K <kapish@ureach.com>
Reply-to: <kapish@ureach.com>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-vsuite-type: e
Subject: [lkcd-general] dump and highmem
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Nov 12 21:30:01 2001
X-Original-Date: Tue, 13 Nov 2001 00:28:39 -0500

Hello,
	While trying to use lkcd and lcrash ( 4.0 ) on dumps from
highmem enbaled boxes, a colleague noticed what might be a bug
in the lkcd code.
The error seems to occur when lcrash looks at the headrrs of
loaded modules in the dump file, one of which is mapped at the
highmemory region of physical memory. 
The dp_address field ( in add_dump_page ) is the virtual address
( obtained frpm page_address(p) which gets the page->vitual
address ), but for pages in highmemory, this would be zero
unless the page was kampped at that point in time during dump.
Is that right?
When lcrash starts, it seems to build an index of physical
pages, and
it uses the dp_address fields to determine the real memory
address by subtracting the kernel page offset (usually
0xC0000000).  Thus, the real memory address of the high memory
pages seem to be incorrect.
We fixed this by changing the code so that dp_address is a
real memory address rather than virtual.
so, the changes we did were the following:
in dump_base.c:
--- drivers/dump/dump_base.c.orig	Wed Nov  7 02:54:53 2001
+++ drivers/dump/dump_base.c	Fri Nov  9 02:24:05 2001
@@ -486,13 +486,12 @@
 #if defined(CONFIG_X86) || defined(CONFIG_ALPHA)
 	extern int page_is_ram(unsigned long);
 #endif
-	unsigned long addr, size;
+	unsigned long size;
 	dump_page_t dp;
 	struct page *p = (struct page *)&(mem_map[mem_loc]);
 	void *vaddr;
 
-	addr = (unsigned long)page_address(p);
-	dp.dp_address = (uint64_t)addr;
+	dp.dp_address = (uint64_t)mem_loc << PAGE_SHIFT;
 	dp.dp_flags = DUMP_DH_RAW;
 
 	/*
in lkcdutils-1.0-7.src.rpm:

--- libklib/kl_cmp.c.orig	Sat Jun 16 22:50:10 2001
+++ libklib/kl_cmp.c	Thu Nov  8 18:11:43 2001
@@ -920,7 +920,7 @@
 {
 	kaddr_t paddr;
 
-	paddr = (kaddr_t)dp->dp_address - KL_PAGE_OFFSET;
+	paddr = (kaddr_t)dp->dp_address;
 	return (paddr);
 }

This seems to fix our problems with being able to look at trace
records for pages in high memory.
What I am looking for is whether this has already been
identified by the lckd team as a problem, and if so, has the fix
you plan the same? if this is not a problem, what have we missd
in here? And finally, if this is a problem and the the solution
is acceptable, could this change get into lkcd?
TIA

________________________________________________
Get your own "800" number
Voicemail, fax, email, and a lot more
http://www.ureach.com/reg/tag


From bsuparna@in.ibm.com Mon Nov 12 22:15:34 2001
Received: from ausmtp02.au.ibm.com ([202.135.136.105])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 163Wo5-0005Jz-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 12 Nov 2001 22:14:58 -0800
Received: from f02n15e.au.ibm.com 
        by ausmtp02.au.ibm.com (IBM AP 2.0) with ESMTP id fAD66e2653760
        for <lkcd-general@lists.sourceforge.net>; Tue, 13 Nov 2001 17:06:47 +1100
Received: from d23hubm4.au.ibm.com (f01n11s [9.185.166.35])
	by f02n15e.au.ibm.com (8.11.1m3/NCO v4.98) with ESMTP id fAD6AK684884
	for <lkcd-general@lists.sourceforge.net>; Tue, 13 Nov 2001 17:10:20 +1100
X-Priority: 1 (High)
Importance: Normal
Subject: Re: [lkcd-general] LKCD 4.0 not working for me
To: Nava Navaruparajah <nava@core.rose.hp.com>
Cc: "lkcd-general" <lkcd-general@lists.sourceforge.net>
From: Suparna_Bhattacharya/India/IBM%IBMIN <bsuparna@in.ibm.com>
Message-ID: <OF82412B9E.FD082CF1-ON65256B03.001F8C83@au.ibm.com>
X-MIMETrack: Serialize by Router on D23HUBM4/23/H/IBM(Release 5.0.8 |June 18, 2001) at
 13/11/2001 17:10:20
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Nov 12 22:16:03 2001
X-Original-Date: Tue, 13 Nov 2001 11:20:55 +0530


Hello Nava,

Could you tell us a little more about the situation under which the dump
was triggered  ?  (Under what circumstances did the crash occur ? Do you
have an idea of where it might have crashed ?)

Regards
Suparna


  Suparna Bhattacharya
  Linux Technology Center
  IBM Software Lab, India
  E-mail : bsuparna@in.ibm.com
  Phone :  91-80-5044961


Nava Navaruparajah <nava@core.rose.hp.com> on 11/10/2001 04:03:03 AM

Please respond to Nava Navaruparajah <nava@core.rose.hp.com>

To:   lkcd-general@lists.sourceforge.net
cc:   nava@core.rose.hp.com (bcc: Suparna Bhattacharya/India/IBM)
Subject:  [lkcd-general] LKCD 4.0 not working for me




Hi,

   I tried latest LKCD 4.0 patched onto 2.4.8 kernel. However the
system hangs while dump being generated. After system crash,
it says "writing Header pages .....". Then hangs while doing
"writing pages..... "..

My setup:
dual pentium processore system. On top of RedHat 7.1, I have
linux 2.4.8.

Am I missing anything?

Thanks,
nava

_______________________________________________
Lkcd-general mailing list
Lkcd-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lkcd-general





From lkcd-general-owner@lists.sourceforge.net Mon Nov 12 23:37:17 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 163Y87-0008Qr-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 12 Nov 2001 23:37:11 -0800
Received: from nakedeye.aparity.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAD7bA029626
	for <lkcd@oss.sgi.com>; Mon, 12 Nov 2001 23:37:11 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAD7evX13632;
	Mon, 12 Nov 2001 23:40:57 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: Kapish K <kapish@ureach.com>
cc: <lkcd@oss.sgi.com>
Subject: Re: [lkcd-general] dump and highmem
In-Reply-To: <200111130528.AAA24441@www20.ureach.com>
Message-ID: <Pine.LNX.4.30.0111122330580.13510-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Nov 12 23:38:02 2001
X-Original-Date: Mon, 12 Nov 2001 23:40:57 -0800 (PST)

Hey, Kapish.  Let me take a look at this and let you know.
I wasn't aware the state of page->virtual address would
ever be NULL unless there were no more highmem pages left.
Can you tell me how you're triggering this, or is it just
specific to random dumps?

The fix seems reasonable, BTW, but I have to test this
with non-highmem as well as highmem.  Did you try this
with a non-highmem kernel?

Thanks, Kapish.

--Matt

On Tue, 13 Nov 2001, Kapish K wrote:
|>Hello,
|>	While trying to use lkcd and lcrash ( 4.0 ) on dumps from
|>highmem enbaled boxes, a colleague noticed what might be a bug
|>in the lkcd code.




From alex_aminoff@alum.mit.edu Tue Nov 13 17:35:29 2001
Received: from gibraltar.basespace.net ([207.106.87.2])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 163oxc-0003W6-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 13 Nov 2001 17:35:28 -0800
Received: by gibraltar.basespace.net (Postfix, from userid 503)
	id 15487189AEF; Tue, 13 Nov 2001 20:35:25 -0500 (EST)
Received: from localhost (localhost [127.0.0.1])
	by gibraltar.basespace.net (Postfix) with ESMTP id 0B3A6175CC2
	for <lkcd-general@lists.sourceforge.net>; Tue, 13 Nov 2001 20:35:25 -0500 (EST)
From: Alex Aminoff <alex_aminoff@alum.mit.edu>
X-X-Sender:  <alex@gibraltar.basespace.net>
To: <lkcd-general@lists.sourceforge.net>
Message-ID: <Pine.LNX.4.33.0111132029220.2215-100000@gibraltar.basespace.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: [lkcd-general] Reporting a very simple config bug
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov 13 17:36:04 2001
X-Original-Date: Tue, 13 Nov 2001 20:35:25 -0500 (EST)

When looking for the swap device to use as a dump device, it would appear
that the installation process does not exclude lines in /etc/fstab that
are commented out. As it happens I had an old swap partition commented
out, like this:

#/dev/hda5          swap             swap    defaults        0 0
/dev/hdb5          swap             swap    defaults        0 0

Which led to /dev/vmdump being linked to a nonexistent file, like this:

lrwxrwxrwx    1 root     root       10 Nov 13 20:24 vmdump -> #/dev/hda5

When I removed the comment line from fstab and re-ran lkcd_config, it
worked just fine, creating the correct link.

A very minor bug but one which should be easy for you to fix.

Thanks,

 - Alex Aminoff
   BaseSpace.net



From lkcd-general-owner@lists.sourceforge.net Wed Nov 14 08:37:32 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 16432X-000219-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 14 Nov 2001 08:37:29 -0800
Received: from e1.ny.us.ibm.com (e1.ny.us.ibm.com [32.97.182.101])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAEGbT002639
	for <lkcd@oss.sgi.com>; Wed, 14 Nov 2001 08:37:29 -0800
Received: from northrelay02.pok.ibm.com (northrelay02.pok.ibm.com [9.117.200.22])
	by e1.ny.us.ibm.com (8.9.3/8.9.3) with ESMTP id LAA483992
	for <lkcd@oss.sgi.com>; Wed, 14 Nov 2001 11:34:28 -0500
Received: from d01mlc96.pok.ibm.com (d01mlc96.pok.ibm.com [9.117.250.33])
	by northrelay02.pok.ibm.com (8.11.1m3/NCO v5.00) with ESMTP id fAEGawY38512
	for <lkcd@oss.sgi.com>; Wed, 14 Nov 2001 11:36:58 -0500
Importance: Normal
Sensitivity: 
To: lkcd@oss.sgi.com
X-Mailer: Lotus Notes Release 5.0.5  September 22, 2000
Message-ID: <OFA5BF63F5.B9904444-ON85256B04.005A4226@pok.ibm.com>
From: "Paul Sutera" <psutera@us.ibm.com>
X-MIMETrack: Serialize by Router on D01MLC96/01/M/IBM(Build V509_10152001.dev02 |October
 31, 2001) at 11/14/2001 11:36:59 AM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Subject: [lkcd-general] (no subject)
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 14 08:38:08 2001
X-Original-Date: Wed, 14 Nov 2001 11:26:17 -0500

Hi,

I'm confused by this statement in your September 2001 news:
"lkcdutils will now have the kernel patch in it; the release number of
lkcdutils will be
tied directly to the kernel patch as a result."
Does this mean I don't need the kernel patch if I install the rpm for the
lkcdutils?
I am on Linux kernel 2.4.7 for s390.   Do I need 2.4.8 or only if I wanted
to use
the patch?

Thanks,

 Paul Sutera
Dept B9LD/P385   Phone: (845)435-1925



From lkcd-general-owner@lists.sourceforge.net Wed Nov 14 09:32:35 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 1643tp-0006PS-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 14 Nov 2001 09:32:33 -0800
Received: from d12lmsgate.de.ibm.com (d12lmsgate.de.ibm.com [195.212.91.199])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAEHWV022321
	for <lkcd@oss.sgi.com>; Wed, 14 Nov 2001 09:32:32 -0800
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id SAA265170;
	Wed, 14 Nov 2001 18:31:43 +0100
Received: from d12ml004.de.ibm.com (d12ml004_cs0 [9.165.223.50])
	by d12relay02.de.ibm.com (8.11.1m3/NCO v4.98) with ESMTP id fAEHVWF104356;
	Wed, 14 Nov 2001 18:31:33 +0100
Subject: Re: [lkcd-general] (no subject)
To: "Paul Sutera" <psutera@us.ibm.com>
Cc: lkcd@oss.sgi.com, lkcd-general-admin@lists.sourceforge.net
X-Mailer: Lotus Notes Release 5.0.3  March 21, 2000
Message-ID: <OF7DCED1AD.A8B18511-ONC1256B04.005F3BEC@de.ibm.com>
From: "Michael Holzheu" <HOLZHEU@de.ibm.com>
X-MIMETrack: Serialize by Router on D12ML004/12/M/IBM(Release 5.0.8 |June 18, 2001) at
 14/11/2001 18:31:38
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 14 09:33:02 2001
X-Original-Date: Wed, 14 Nov 2001 18:30:07 +0100

Paul,

For Linux/s390 you do not need the kernel patch.
The Kernel patch is for dump generation and is at the moment intel specific
(as far as I know).
On Linux/390 we have standalone dump tools for generating dumps.

But you need the Kerntypes-Kernel patch which you can obtain from
e.g.
http://www10.software.ibm.com/developerworks/opensource/linux390/current2_4.shtml.
After you applied this patch to the kernel, do a "make Kerntypes" and use
the built Kerntypes file
as input file for lcrash.

The lkcdutils rpm installs lcrash and some other small tools for dump
analysis
which you need to analyze dumps generated with the stand alone dump
tools.

You also can of course use lcrash on a live system using /dev/mem.

       Michael

------------------------------------------------------------------------
Linux/390 Development
Phone: +49-7031-16-2360,  Bld 71032-06-109
Email: holzheu@de.ibm.com



|--------+---------------------------------------->
|        |          Paul                          |
|        |          Sutera/Poughkeepsie/IBM@IBMUS |
|        |          Sent by:                      |
|        |          lkcd-general-admin@lists.sourc|
|        |          eforge.net                    |
|        |                                        |
|        |                                        |
|        |          11/14/01 05:26 PM             |
|        |          Please respond to Paul Sutera |
|        |                                        |
|--------+---------------------------------------->
  >----------------------------------------------------------------------------------------------------------|
  |                                                                                                          |
  |      To:     lkcd@oss.sgi.com                                                                            |
  |      cc:                                                                                                 |
  |      Subject:     [lkcd-general] (no subject)                                                            |
  |                                                                                                          |
  |                                                                                                          |
  >----------------------------------------------------------------------------------------------------------|



Hi,

I'm confused by this statement in your September 2001 news:
"lkcdutils will now have the kernel patch in it; the release number of
lkcdutils will be
tied directly to the kernel patch as a result."
Does this mean I don't need the kernel patch if I install the rpm for the
lkcdutils?
I am on Linux kernel 2.4.7 for s390.   Do I need 2.4.8 or only if I wanted
to use
the patch?

Thanks,

 Paul Sutera
Dept B9LD/P385   Phone: (845)435-1925


_______________________________________________
Lkcd-general mailing list
Lkcd-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lkcd-general






From lkcd-general-owner@lists.sourceforge.net Wed Nov 14 10:38:52 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 1644vu-0004Am-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 14 Nov 2001 10:38:46 -0800
Received: from nakedeye.aparity.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAEIcj031205
	for <lkcd@oss.sgi.com>; Wed, 14 Nov 2001 10:38:45 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAEIguG15266;
	Wed, 14 Nov 2001 10:42:56 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: Michael Holzheu <HOLZHEU@de.ibm.com>
cc: Paul Sutera <psutera@us.ibm.com>, <lkcd@oss.sgi.com>,
   <lkcd-general-admin@lists.sourceforge.net>
Subject: Re: [lkcd-general] (no subject)
In-Reply-To: <OF7DCED1AD.A8B18511-ONC1256B04.005F3BEC@de.ibm.com>
Message-ID: <Pine.LNX.4.30.0111141041130.15261-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 14 10:39:03 2001
X-Original-Date: Wed, 14 Nov 2001 10:42:56 -0800 (PST)

Hi, Paul.  Based on some LKCD users/customers (such as IBM) not
needing to bundle the patch with the lkcdutils RPM, we have decided
to keep them separate.  We are, however, binding the revision
level of the lkcdutils RPM to the patch revision level.  That
wasn't always the case.

Right now, it's 4.0-1 for the lkcdutils RPM and 4.0 for the patch.
As soon as I'm done with the final check-ins, it'll be 4.0.1 for both.

--Matt

On Wed, 14 Nov 2001, Michael Holzheu wrote:
|>Paul,
|>
|>For Linux/s390 you do not need the kernel patch.
|>The Kernel patch is for dump generation and is at the moment intel specific
|>(as far as I know).
|>On Linux/390 we have standalone dump tools for generating dumps.
|>
|>But you need the Kerntypes-Kernel patch which you can obtain from
|>e.g.
|>http://www10.software.ibm.com/developerworks/opensource/linux390/current2_4.shtml.
|>After you applied this patch to the kernel, do a "make Kerntypes" and use
|>the built Kerntypes file
|>as input file for lcrash.
|>
|>The lkcdutils rpm installs lcrash and some other small tools for dump
|>analysis
|>which you need to analyze dumps generated with the stand alone dump
|>tools.
|>
|>You also can of course use lcrash on a live system using /dev/mem.
|>
|>       Michael
|>
|>------------------------------------------------------------------------
|>Linux/390 Development
|>Phone: +49-7031-16-2360,  Bld 71032-06-109
|>Email: holzheu@de.ibm.com
|>
|>
|>
|>|--------+---------------------------------------->
|>|        |          Paul                          |
|>|        |          Sutera/Poughkeepsie/IBM@IBMUS |
|>|        |          Sent by:                      |
|>|        |          lkcd-general-admin@lists.sourc|
|>|        |          eforge.net                    |
|>|        |                                        |
|>|        |                                        |
|>|        |          11/14/01 05:26 PM             |
|>|        |          Please respond to Paul Sutera |
|>|        |                                        |
|>|--------+---------------------------------------->
|>  >----------------------------------------------------------------------------------------------------------|
|>  |                                                                                                          |
|>  |      To:     lkcd@oss.sgi.com                                                                            |
|>  |      cc:                                                                                                 |
|>  |      Subject:     [lkcd-general] (no subject)                                                            |
|>  |                                                                                                          |
|>  |                                                                                                          |
|>  >----------------------------------------------------------------------------------------------------------|
|>
|>
|>
|>Hi,
|>
|>I'm confused by this statement in your September 2001 news:
|>"lkcdutils will now have the kernel patch in it; the release number of
|>lkcdutils will be
|>tied directly to the kernel patch as a result."
|>Does this mean I don't need the kernel patch if I install the rpm for the
|>lkcdutils?
|>I am on Linux kernel 2.4.7 for s390.   Do I need 2.4.8 or only if I wanted
|>to use
|>the patch?
|>
|>Thanks,
|>
|> Paul Sutera
|>Dept B9LD/P385   Phone: (845)435-1925



From yakker@aparity.com Wed Nov 14 10:41:15 2001
Received: from w032.z064001165.sjc-ca.dsl.cnc.net ([64.1.165.32] helo=nakedeye.aparity.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 1644yD-0004wZ-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 14 Nov 2001 10:41:09 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAEIjHt15351;
	Wed, 14 Nov 2001 10:45:18 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: Alex Aminoff <alex_aminoff@alum.mit.edu>
cc: <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] Reporting a very simple config bug
In-Reply-To: <Pine.LNX.4.33.0111132029220.2215-100000@gibraltar.basespace.net>
Message-ID: <Pine.LNX.4.30.0111141044490.15261-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 14 10:42:02 2001
X-Original-Date: Wed, 14 Nov 2001 10:45:17 -0800 (PST)

Fixed.  Added grep -v '^#' to the primary_swapdev variable
assignment.  You're right, this was a stupid bug. :)

--Matt

On Tue, 13 Nov 2001, Alex Aminoff wrote:
|>When looking for the swap device to use as a dump device, it would appear
|>that the installation process does not exclude lines in /etc/fstab that
|>are commented out. As it happens I had an old swap partition commented
|>out, like this:
|>
|>#/dev/hda5          swap             swap    defaults        0 0
|>/dev/hdb5          swap             swap    defaults        0 0
|>
|>Which led to /dev/vmdump being linked to a nonexistent file, like this:
|>
|>lrwxrwxrwx    1 root     root       10 Nov 13 20:24 vmdump -> #/dev/hda5
|>
|>When I removed the comment line from fstab and re-ran lkcd_config, it
|>worked just fine, creating the correct link.
|>
|>A very minor bug but one which should be easy for you to fix.
|>
|>Thanks,
|>
|> - Alex Aminoff
|>   BaseSpace.net
|>
|>
|>_______________________________________________
|>Lkcd-general mailing list
|>Lkcd-general@lists.sourceforge.net
|>https://lists.sourceforge.net/lists/listinfo/lkcd-general
|>



From yakker@aparity.com Wed Nov 14 10:58:05 2001
Received: from w032.z064001165.sjc-ca.dsl.cnc.net ([64.1.165.32] helo=nakedeye.aparity.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 1645EV-0008Sn-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 14 Nov 2001 10:57:59 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAEItpF15389;
	Wed, 14 Nov 2001 10:55:51 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: Nava Navaruparajah <nava@core.rose.hp.com>
cc: <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] LKCD 4.0 not working for me
In-Reply-To: <200111092233.OAA16306@core.rose.hp.com>
Message-ID: <Pine.LNX.4.30.0111141055350.15261-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 14 10:59:02 2001
X-Original-Date: Wed, 14 Nov 2001 10:55:51 -0800 (PST)

Just out of curiosity, does your system have IDE or SCSI as the
dump device?

--Matt

On Fri, 9 Nov 2001, Nava Navaruparajah wrote:
|>Hi,
|>
|>   I tried latest LKCD 4.0 patched onto 2.4.8 kernel. However the
|>system hangs while dump being generated. After system crash,
|>it says "writing Header pages .....". Then hangs while doing
|>"writing pages..... "..
|>
|>My setup:
|>dual pentium processore system. On top of RedHat 7.1, I have
|>linux 2.4.8.
|>
|>Am I missing anything?
|>
|>Thanks,
|>nava
|>
|>_______________________________________________
|>Lkcd-general mailing list
|>Lkcd-general@lists.sourceforge.net
|>https://lists.sourceforge.net/lists/listinfo/lkcd-general
|>



From yakker@aparity.com Wed Nov 14 10:58:53 2001
Received: from w032.z064001165.sjc-ca.dsl.cnc.net ([64.1.165.32] helo=nakedeye.aparity.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 1645FH-0000In-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 14 Nov 2001 10:58:47 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAEIudj15397;
	Wed, 14 Nov 2001 10:56:39 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: Nava Navaruparajah <nava@core.rose.hp.com>
cc: <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] FAQ is missing
In-Reply-To: <200111080246.SAA12523@core.rose.hp.com>
Message-ID: <Pine.LNX.4.30.0111141056080.15261-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 14 10:59:06 2001
X-Original-Date: Wed, 14 Nov 2001 10:56:39 -0800 (PST)

Working on it -- it was sorely out of date, so I didn't
incorporate it to the new site yet.  I can put the old one
up, but again, it's got a lot of older information in it.

--Matt

On Wed, 7 Nov 2001, Nava Navaruparajah wrote:
|>Hi,
|>
|>  I want to try the latest lkcd - 4.0. However, the FAQ is
|>missing in http://lkcd.sourceforge.net/faq.html
|>
|>  Can someone point me the right FAQ for getting started with
|>the latest lkcd.
|>
|>thanks,
|>nava
|>
|>_______________________________________________
|>Lkcd-general mailing list
|>Lkcd-general@lists.sourceforge.net
|>https://lists.sourceforge.net/lists/listinfo/lkcd-general
|>



From ctindel@falcon.csc.calpoly.edu Wed Nov 14 16:25:01 2001
Received: from falcon.csc.calpoly.edu ([129.65.242.5])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 164AKy-0007iw-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 14 Nov 2001 16:25:00 -0800
Received: from hornet.csc.calpoly.edu (hornet.csc.calpoly.edu [129.65.242.4])
	by falcon.csc.calpoly.edu (8.10.2+Sun/8.10.2) with ESMTP id fAF0Om602266
	for <lkcd-general@lists.sourceforge.net>; Wed, 14 Nov 2001 16:24:49 -0800 (PST)
Received: from localhost (ctindel@localhost)
	by hornet.csc.calpoly.edu (8.10.2+Sun/8.10.2) with ESMTP id fAF0OnB08995
	for <lkcd-general@lists.sourceforge.net>; Wed, 14 Nov 2001 16:24:49 -0800 (PST)
From: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>
To: <lkcd-general@lists.sourceforge.net>
Message-ID: <Pine.GSO.4.33.0111141616470.7526-100000@hornet.csc.calpoly.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: [lkcd-general] LKCD 3.1.3
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 14 16:25:04 2001
X-Original-Date: Wed, 14 Nov 2001 16:24:49 -0800 (PST)

Hi all-

I have a need to use LKCD with 2.4.2 kernel (on RH 7.1) so I'm running the
3.1.3 version that is up on the web.

For some reason, all I get in my dumps are the header (and BTW, the panic
string is corrupted because the string in the header structure is sized at
0x100, but kernel/panic.c has a buffer 1024 bytes long).  I don't get any
pages dumped out.

I have the panic level set to 4, so I should be getting complete memory
dumps.  Is there anything else I need to do?

Also, when will the 4.0 version be upgraded to a later kernel, like 2.4.15?
I'm assuming you guys were just waiting for the VM stuff to stabilize and
for Alan and Linus to decide what happened, but now that a decision has been
made, what is your plan for porting forward?

Here's my /etc/sysconfig/vmdump:

DUMP_ACTIVE=1
DUMPDEV=/dev/sdb7
DUMPDIR=/home/vmdump
DUMP_SAVE=1
DUMP_LEVEL=4
DUMP_COMPRESS_PAGES=1
PANIC_TIMEOUT=5

And my fdisk output:

[root@itchy sysconfig]# fdisk -l /dev/sdb

Disk /dev/sdb: 64 heads, 32 sectors, 8678 cylinders
Units = cylinders of 2048 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
   /dev/sdb1             1       100    102384   83  Linux
   /dev/sdb2           101      1001    922624   83  Linux
   /dev/sdb3          1002      8678   7861248    5  Extended
   /dev/sdb5          1002      4002   3073008   83  Linux
   /dev/sdb6          4003      7003   3073008   83  Linux
   /dev/sdb7          7004      8678   1715184   82  Linux swap

And my /proc/meminfo:

[root@itchy sysconfig]# cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
		Mem:  2108710912 53956608 2054754304        0  2551808 27136000
		Swap: 4066312192        0 4066312192
		MemTotal:      2059288 kB
		MemFree:       2006596 kB
		MemShared:           0 kB
		Buffers:          2492 kB
		Cached:          26500 kB
		Active:          10492 kB
		Inact_dirty:     18500 kB
		Inact_clean:         0 kB
		Inact_target:      108 kB
		HighTotal:     1179584 kB
		HighFree:      1144492 kB
		LowTotal:       879704 kB
		LowFree:        862104 kB
		SwapTotal:     3971008 kB
		SwapFree:      3971008 kB

So it should be creating 2 GB worth of pages, but compressed enough to fit
onto 1.7 GB of swap.

Any advice?

Thanks,

Chad



From yakker@alacritech.com Wed Nov 14 19:50:24 2001
Received: from smtp.alacritech.com ([209.10.208.82])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 164DXj-0006KB-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 14 Nov 2001 19:50:23 -0800
Received: from alacritech.com ([10.1.10.36])
	by smtp.alacritech.com (8.11.2/8.11.2) with ESMTP id fAF3mSK18310;
	Wed, 14 Nov 2001 19:48:28 -0800
Message-ID: <3BF33A6D.CED5D8A6@alacritech.com>
From: "Matt D. Robinson" <yakker@alacritech.com>
Organization: Alacritech, Inc.
X-Mailer: Mozilla 4.78 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>
CC: lkcd-general@lists.sourceforge.net
Subject: Re: [lkcd-general] LKCD 3.1.3
References: <Pine.GSO.4.33.0111141616470.7526-100000@hornet.csc.calpoly.edu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 14 19:51:02 2001
X-Original-Date: Wed, 14 Nov 2001 19:45:49 -0800

"Chad N. Tindel" wrote:
> 
> Hi all-
> 
> I have a need to use LKCD with 2.4.2 kernel (on RH 7.1) so I'm running the
> 3.1.3 version that is up on the web.

4.0 is on the web (lkcd.sourceforge.net) ...

> For some reason, all I get in my dumps are the header (and BTW, the panic
> string is corrupted because the string in the header structure is sized at
> 0x100, but kernel/panic.c has a buffer 1024 bytes long).  I don't get any
> pages dumped out.

This issue should be completely resolved in 4.0.  There was a
problem with the 3.1.3 dumps where dumping may not complete
due to interrupt state.

> I have the panic level set to 4, so I should be getting complete memory
> dumps.  Is there anything else I need to do?
> 
> Also, when will the 4.0 version be upgraded to a later kernel, like 2.4.15?
> I'm assuming you guys were just waiting for the VM stuff to stabilize and
> for Alan and Linus to decide what happened, but now that a decision has been
> made, what is your plan for porting forward?

Hmmm, 4.0 should patch fairly closely into 2.4.15.  I haven't pushed
it up yet, as I'm trying to get 4.0.1 tested.  If you're using the
latest files from the CVS tree, most of the patch should apply pretty
well.

> Here's my /etc/sysconfig/vmdump:
> 
> DUMP_ACTIVE=1
> DUMPDEV=/dev/sdb7
> DUMPDIR=/home/vmdump
> DUMP_SAVE=1
> DUMP_LEVEL=4
> DUMP_COMPRESS_PAGES=1
> PANIC_TIMEOUT=5
> 
> And my fdisk output:
> 
> [root@itchy sysconfig]# fdisk -l /dev/sdb
> 
> Disk /dev/sdb: 64 heads, 32 sectors, 8678 cylinders
> Units = cylinders of 2048 * 512 bytes
> 
>    Device Boot    Start       End    Blocks   Id  System
>    /dev/sdb1             1       100    102384   83  Linux
>    /dev/sdb2           101      1001    922624   83  Linux
>    /dev/sdb3          1002      8678   7861248    5  Extended
>    /dev/sdb5          1002      4002   3073008   83  Linux
>    /dev/sdb6          4003      7003   3073008   83  Linux
>    /dev/sdb7          7004      8678   1715184   82  Linux swap
> 
> And my /proc/meminfo:
> 
> [root@itchy sysconfig]# cat /proc/meminfo
>         total:    used:    free:  shared: buffers:  cached:
>                 Mem:  2108710912 53956608 2054754304        0  2551808 27136000
>                 Swap: 4066312192        0 4066312192
>                 MemTotal:      2059288 kB
>                 MemFree:       2006596 kB
>                 MemShared:           0 kB
>                 Buffers:          2492 kB
>                 Cached:          26500 kB
>                 Active:          10492 kB
>                 Inact_dirty:     18500 kB
>                 Inact_clean:         0 kB
>                 Inact_target:      108 kB
>                 HighTotal:     1179584 kB
>                 HighFree:      1144492 kB
>                 LowTotal:       879704 kB
>                 LowFree:        862104 kB
>                 SwapTotal:     3971008 kB
>                 SwapFree:      3971008 kB
> 
> So it should be creating 2 GB worth of pages, but compressed enough to fit
> onto 1.7 GB of swap.

If you use gzip compression, you should be able to save some of that
extra disk space.

> Any advice?

Again, I'd load up 4.0, or take the files directly from the
CVS tree.  I'll try to test 2.4.15, but I have to get these
RH 7.1/7.2 trees finished first.

> Thanks,
> 
> Chad

Never enough time in the day ... :)

--Matt


From ctindel@falcon.csc.calpoly.edu Wed Nov 14 22:41:52 2001
Received: from falcon.csc.calpoly.edu ([129.65.242.5])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 164GDg-0004OV-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 14 Nov 2001 22:41:52 -0800
Received: from hornet.csc.calpoly.edu (hornet.csc.calpoly.edu [129.65.242.4])
	by falcon.csc.calpoly.edu (8.10.2+Sun/8.10.2) with ESMTP id fAF6ff600420;
	Wed, 14 Nov 2001 22:41:41 -0800 (PST)
Received: from localhost (ctindel@localhost)
	by hornet.csc.calpoly.edu (8.10.2+Sun/8.10.2) with ESMTP id fAF6fg026890;
	Wed, 14 Nov 2001 22:41:42 -0800 (PST)
From: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>
To: "Matt D. Robinson" <yakker@alacritech.com>
cc: <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] LKCD 3.1.3Z
In-Reply-To: <3BF33A6D.CED5D8A6@alacritech.com>
Message-ID: <Pine.GSO.4.33.0111142238380.26523-100000@hornet.csc.calpoly.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 14 22:42:03 2001
X-Original-Date: Wed, 14 Nov 2001 22:41:42 -0800 (PST)

> "Chad N. Tindel" wrote:
> >
> > Hi all-
> >
> > I have a need to use LKCD with 2.4.2 kernel (on RH 7.1) so I'm running the
> > 3.1.3 version that is up on the web.
>
> 4.0 is on the web (lkcd.sourceforge.net) ...

How do I get a patch for 4.0 on 2.4.2?  All I saw was 2.4.8...

> > For some reason, all I get in my dumps are the header (and BTW, the panic
> > string is corrupted because the string in the header structure is sized at
> > 0x100, but kernel/panic.c has a buffer 1024 bytes long).  I don't get any
> > pages dumped out.
>
> This issue should be completely resolved in 4.0.  There was a
> problem with the 3.1.3 dumps where dumping may not complete
> due to interrupt state.

True.  4.0 works fine with 2.4.8.  I just *have* to use 2.4.2 since that's
what ships on redhat 7.1, and I don't really care what version that is.

> Again, I'd load up 4.0, or take the files directly from the
> CVS tree.  I'll try to test 2.4.15, but I have to get these
> RH 7.1/7.2 trees finished first.

> Never enough time in the day ... :)

I heard that.  I'm trying to get some bonding changes finalized over in that
other part of the kernel.  :)

Chad




From yakker@alacritech.com Thu Nov 15 01:57:59 2001
Received: from smtp.alacritech.com ([209.10.208.82])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 164JHS-0001d6-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 15 Nov 2001 01:57:58 -0800
Received: from alacritech.com ([10.1.10.36])
	by smtp.alacritech.com (8.11.2/8.11.2) with ESMTP id fAF9u4K21336
	for <lkcd-general@lists.sourceforge.net>; Thu, 15 Nov 2001 01:56:04 -0800
Message-ID: <3BF39094.BB08667C@alacritech.com>
From: "Matt D. Robinson" <yakker@alacritech.com>
Organization: Alacritech, Inc.
X-Mailer: Mozilla 4.78 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: lkcd-general@lists.sourceforge.net
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Subject: [lkcd-general] LKCD experimental directory available
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov 15 01:58:02 2001
X-Original-Date: Thu, 15 Nov 2001 01:53:24 -0800

So, given that I have only so many hours in the day in which
to do testing of any given piece of code, and knowing that any
and all changes for other people's code requires lots and lots
of additional testing, and given that multiple releases are now
being requested, plus working on my own code ...

I've created an "experimental" directory on the download site
on lkcd.sourceforge.net for those developers and bleeding edge
testers who want the latest release of code, but in patch/RPM
form.  I'd normally just release this against a single version,
but I'm getting lots of RedHat requests, which, while reasonable,
are numerous, and I don't have time to test each release.

So here's the deal.

I'll put out the patches, RPMs, etc., but I need some help
testing these.   If you would be interested in testing LKCD for
new releases, I can certainly provide information how to do just
that.  I just can't cover 2.4.2-2, 2.4.3-12, 2.4.9-12, etc.,
etc., etc., much less keep up with the Linus/Alan fiasco.

In the experimental directory right now is the 4.0.1 release.
The differences between 4.0 and 4.0.1 are:

- Patch from Suparna to handle SCSI/SMP interrupt cases, and
  add in an additional bit of code to deal with wakeup on
  kiobufs when dumps are configured;

- Patch from Kapish to deal with dump/highmem pages having an
  invalid dp_address assigned to the page dump headers;

- Patch for reported problem from Alex Aminoff to deal with
  commented-out swap partitions in /etc/fstab being linked as
  the primary dump device;

- Small clean-ups with zlib.h check-in, spec file update, etc.
  Full gzip compression included in the patch;

Please let me know if I've totally broken the RH 7.1 patch, or
if the linux-2.4.8 patch has a problem.  If things work, though,
I'll post the rest of the RedHat patches for 7.1 and 7.2
(including updates) for everyone.  That means 2.4.2-2, 2.4.3-12,
2.4.7-10, 2.4.9-12, and 2.4.9-13 (whew!)

All the files are in:

	lkcd.sourceforge.net/download/experimental/<release>

Thanks, any and all help is appreciated.

--Matt


From yakker@alacritech.com Thu Nov 15 01:59:21 2001
Received: from smtp.alacritech.com ([209.10.208.82])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 164JIm-0002EP-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 15 Nov 2001 01:59:21 -0800
Received: from alacritech.com ([10.1.10.36])
	by smtp.alacritech.com (8.11.2/8.11.2) with ESMTP id fAF9vRK21361;
	Thu, 15 Nov 2001 01:57:27 -0800
Message-ID: <3BF390E6.9BB970C@alacritech.com>
From: "Matt D. Robinson" <yakker@alacritech.com>
Organization: Alacritech, Inc.
X-Mailer: Mozilla 4.78 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>
CC: lkcd-general@lists.sourceforge.net
Subject: Re: [lkcd-general] LKCD 3.1.3Z
References: <Pine.GSO.4.33.0111142238380.26523-100000@hornet.csc.calpoly.edu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov 15 02:00:02 2001
X-Original-Date: Thu, 15 Nov 2001 01:54:46 -0800

Chad, you're more than welcome to test the experimental stuff on
SourceForge (in the download directory) for 4.0.1, which includes
the RedHat 7.1 2.4.2-2 kernel patch.  If you see any problems with
it, let me know, and I'll try to fix whatever bug you may see.

--Matt

"Chad N. Tindel" wrote:
> 
> > "Chad N. Tindel" wrote:
> > >
> > > Hi all-
> > >
> > > I have a need to use LKCD with 2.4.2 kernel (on RH 7.1) so I'm running the
> > > 3.1.3 version that is up on the web.
> >
> > 4.0 is on the web (lkcd.sourceforge.net) ...
> 
> How do I get a patch for 4.0 on 2.4.2?  All I saw was 2.4.8...
> 
> > > For some reason, all I get in my dumps are the header (and BTW, the panic
> > > string is corrupted because the string in the header structure is sized at
> > > 0x100, but kernel/panic.c has a buffer 1024 bytes long).  I don't get any
> > > pages dumped out.
> >
> > This issue should be completely resolved in 4.0.  There was a
> > problem with the 3.1.3 dumps where dumping may not complete
> > due to interrupt state.
> 
> True.  4.0 works fine with 2.4.8.  I just *have* to use 2.4.2 since that's
> what ships on redhat 7.1, and I don't really care what version that is.
> 
> > Again, I'd load up 4.0, or take the files directly from the
> > CVS tree.  I'll try to test 2.4.15, but I have to get these
> > RH 7.1/7.2 trees finished first.
> 
> > Never enough time in the day ... :)
> 
> I heard that.  I'm trying to get some bonding changes finalized over in that
> other part of the kernel.  :)
> 
> Chad
> 
> _______________________________________________
> Lkcd-general mailing list
> Lkcd-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/lkcd-general


From ctindel@falcon.csc.calpoly.edu Thu Nov 15 11:31:17 2001
Received: from falcon.csc.calpoly.edu ([129.65.242.5])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 164SEH-0005vV-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 15 Nov 2001 11:31:17 -0800
Received: from hornet.csc.calpoly.edu (hornet.csc.calpoly.edu [129.65.242.4])
	by falcon.csc.calpoly.edu (8.10.2+Sun/8.10.2) with ESMTP id fAFJV8613533;
	Thu, 15 Nov 2001 11:31:08 -0800 (PST)
Received: from localhost (ctindel@localhost)
	by hornet.csc.calpoly.edu (8.10.2+Sun/8.10.2) with ESMTP id fAFJV6Z11094;
	Thu, 15 Nov 2001 11:31:06 -0800 (PST)
From: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>
To: "Matt D. Robinson" <yakker@alacritech.com>
cc: <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] LKCD 3.1.3Z
In-Reply-To: <3BF390E6.9BB970C@alacritech.com>
Message-ID: <Pine.GSO.4.33.0111151128350.11064-100000@hornet.csc.calpoly.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov 15 11:32:01 2001
X-Original-Date: Thu, 15 Nov 2001 11:31:06 -0800 (PST)

> Chad, you're more than welcome to test the experimental stuff on
> SourceForge (in the download directory) for 4.0.1, which includes
> the RedHat 7.1 2.4.2-2 kernel patch.  If you see any problems with
> it, let me know, and I'll try to fix whatever bug you may see.

I can't get it to build on redhat 2.4.2 kernel.  I get the following build
error:

make[3]: Entering directory `/usr/src/linux-2.4.2/drivers/dump'
gcc -D__KERNEL__ -I/usr/src/linux-2.4.2/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -fno-common -Wno-unused -pipe -mpreferred-stack-boundary=2 -march=i686    -c -o dump_base.o dump_base.c
dump_base.c: In function `dump_kernel_write':
dump_base.c:390: structure has no member named `blocks'
dump_base.c: In function `dump_add_page':
dump_base.c:527: `addr' undeclared (first use in this function)
dump_base.c:527: (Each undeclared identifier is reported only once
dump_base.c:527: for each function it appears in.)

I'm assuming that on 527 addr should be vaddr.  Is this correct?

As for the 390 error, i'm not sure what to do about that... I just don't
know that much about the VM area of the kernel.

Chad




From ctindel@falcon.csc.calpoly.edu Thu Nov 15 11:43:50 2001
Received: from falcon.csc.calpoly.edu ([129.65.242.5])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 164SQQ-00017V-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 15 Nov 2001 11:43:50 -0800
Received: from hornet.csc.calpoly.edu (hornet.csc.calpoly.edu [129.65.242.4])
	by falcon.csc.calpoly.edu (8.10.2+Sun/8.10.2) with ESMTP id fAFJhg614050;
	Thu, 15 Nov 2001 11:43:42 -0800 (PST)
Received: from localhost (ctindel@localhost)
	by hornet.csc.calpoly.edu (8.10.2+Sun/8.10.2) with ESMTP id fAFJhee11240;
	Thu, 15 Nov 2001 11:43:40 -0800 (PST)
From: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>
To: "Matt D. Robinson" <yakker@alacritech.com>
cc: <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] LKCD 3.1.3
In-Reply-To: <3BF390E6.9BB970C@alacritech.com>
Message-ID: <Pine.GSO.4.33.0111151140340.11064-100000@hornet.csc.calpoly.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov 15 11:44:03 2001
X-Original-Date: Thu, 15 Nov 2001 11:43:40 -0800 (PST)

> Chad, you're more than welcome to test the experimental stuff on
> SourceForge (in the download directory) for 4.0.1, which includes
> the RedHat 7.1 2.4.2-2 kernel patch.  If you see any problems with
> it, let me know, and I'll try to fix whatever bug you may see.

Just out of pure curiousity, is it possible to get information out of the
3.1.3 dump header like the system process list and the kernel stack trace for
each CPU?  I mean, my dump header is 796 KB, so it must be storing something
in there other than the panic string and System information.

I would be OK with the 3.1.3 at dump level=1 if that information was
available.

Thanks,

Chad



From yakker@alacritech.com Thu Nov 15 11:51:55 2001
Received: from smtp.alacritech.com ([209.10.208.82])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 164SYF-000341-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 15 Nov 2001 11:51:55 -0800
Received: from alacritech.com (lambda.alacritech.com [10.1.1.32])
	by smtp.alacritech.com (8.11.2/8.11.2) with ESMTP id fAFJnxK28348;
	Thu, 15 Nov 2001 11:49:59 -0800
Message-ID: <3BF42C5B.E18DE3D5@alacritech.com>
From: "Matt D. Robinson" <yakker@alacritech.com>
Organization: Alacritech, Inc.
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.4.2-2smp i686)
X-Accept-Language: en
MIME-Version: 1.0
To: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>
CC: lkcd-general@lists.sourceforge.net
Subject: Re: [lkcd-general] LKCD 3.1.3Z
References: <Pine.GSO.4.33.0111151128350.11064-100000@hornet.csc.calpoly.edu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov 15 11:52:03 2001
X-Original-Date: Thu, 15 Nov 2001 12:58:03 -0800

Fixing them now ... I'll let you know when I have a build patch
available.  I'm awake again so I can re-start the build (I saw
the same thing).

--Matt

"Chad N. Tindel" wrote:
> 
> > Chad, you're more than welcome to test the experimental stuff on
> > SourceForge (in the download directory) for 4.0.1, which includes
> > the RedHat 7.1 2.4.2-2 kernel patch.  If you see any problems with
> > it, let me know, and I'll try to fix whatever bug you may see.
> 
> I can't get it to build on redhat 2.4.2 kernel.  I get the following build
> error:
> 
> make[3]: Entering directory `/usr/src/linux-2.4.2/drivers/dump'
> gcc -D__KERNEL__ -I/usr/src/linux-2.4.2/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -fno-common -Wno-unused -pipe -mpreferred-stack-boundary=2 -march=i686    -c -o dump_base.o dump_base.c
> dump_base.c: In function `dump_kernel_write':
> dump_base.c:390: structure has no member named `blocks'
> dump_base.c: In function `dump_add_page':
> dump_base.c:527: `addr' undeclared (first use in this function)
> dump_base.c:527: (Each undeclared identifier is reported only once
> dump_base.c:527: for each function it appears in.)
> 
> I'm assuming that on 527 addr should be vaddr.  Is this correct?
> 
> As for the 390 error, i'm not sure what to do about that... I just don't
> know that much about the VM area of the kernel.
> 
> Chad


From yakker@aparity.com Fri Nov 16 02:11:40 2001
Received: from w032.z064001165.sjc-ca.dsl.cnc.net ([64.1.165.32] helo=nakedeye.aparity.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 164fyA-0003JL-00
	for <lkcd-general@lists.sourceforge.net>; Fri, 16 Nov 2001 02:11:34 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAGAFYO27400;
	Fri, 16 Nov 2001 02:15:35 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>
cc: "Matt D. Robinson" <yakker@alacritech.com>,
   <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] LKCD 3.1.3
In-Reply-To: <Pine.GSO.4.33.0111151128350.11064-100000@hornet.csc.calpoly.edu>
Message-ID: <Pine.LNX.4.30.0111160213220.26869-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Fri Nov 16 02:12:03 2001
X-Original-Date: Fri, 16 Nov 2001 02:15:34 -0800 (PST)

The build errors should now be fixed, I have to crash, but if I get
a chance tomorrow morning I'll boot on a new kernel and run a few
crash tests.  Feel free to grab the replacement patch.

The fixes to the code put back in the addr unsigned long, since we
use it for gzip compression page checks (really, it's for all types
of compression, but gzip's really the one that needs it).  I have
also put some KERNEL_VERSION() tags around the blocks[] field with
the kiobuf.  That field was added with 2.4.4 onward.  I also cleaned
up the include/asm links left in the kernel-source RPM. :)

Thanks, Chad.

--Matt

On Thu, 15 Nov 2001, Chad N. Tindel wrote:
|>> Chad, you're more than welcome to test the experimental stuff on
|>> SourceForge (in the download directory) for 4.0.1, which includes
|>> the RedHat 7.1 2.4.2-2 kernel patch.  If you see any problems with
|>> it, let me know, and I'll try to fix whatever bug you may see.
|>
|>I can't get it to build on redhat 2.4.2 kernel.  I get the following build
|>error:
|>
|>make[3]: Entering directory `/usr/src/linux-2.4.2/drivers/dump'
|>gcc -D__KERNEL__ -I/usr/src/linux-2.4.2/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -fno-common -Wno-unused -pipe -mpreferred-stack-boundary=2 -march=i686    -c -o dump_base.o dump_base.c
|>dump_base.c: In function `dump_kernel_write':
|>dump_base.c:390: structure has no member named `blocks'
|>dump_base.c: In function `dump_add_page':
|>dump_base.c:527: `addr' undeclared (first use in this function)
|>dump_base.c:527: (Each undeclared identifier is reported only once
|>dump_base.c:527: for each function it appears in.)
|>
|>I'm assuming that on 527 addr should be vaddr.  Is this correct?
|>
|>As for the 390 error, i'm not sure what to do about that... I just don't
|>know that much about the VM area of the kernel.
|>
|>Chad
|>
|>
|>
|>_______________________________________________
|>Lkcd-general mailing list
|>Lkcd-general@lists.sourceforge.net
|>https://lists.sourceforge.net/lists/listinfo/lkcd-general
|>



From yakker@alacritech.com Fri Nov 16 02:22:16 2001
Received: from smtp.alacritech.com ([209.10.208.82])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 164g8W-0005P9-00
	for <lkcd-general@lists.sourceforge.net>; Fri, 16 Nov 2001 02:22:16 -0800
Received: from alacritech.com ([10.1.10.35])
	by smtp.alacritech.com (8.11.2/8.11.2) with ESMTP id fAGAKEK05262;
	Fri, 16 Nov 2001 02:20:14 -0800
Message-ID: <3BF4E7B9.A517F162@alacritech.com>
From: "Matt D. Robinson" <yakker@alacritech.com>
Organization: Alacritech, Inc.
X-Mailer: Mozilla 4.78 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: "Matt D. Robinson" <yakker@aparity.com>
CC: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>,
   lkcd-general@lists.sourceforge.net
Subject: Re: [lkcd-general] LKCD 3.1.3
References: <Pine.LNX.4.30.0111160213220.26869-100000@nakedeye.aparity.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Fri Nov 16 02:23:02 2001
X-Original-Date: Fri, 16 Nov 2001 02:17:29 -0800

Well, there's still an issue with irq_affinity[].  I'll fix that
and get the next version out.

--Matt

"Matt D. Robinson" wrote:
> 
> The build errors should now be fixed, I have to crash, but if I get
> a chance tomorrow morning I'll boot on a new kernel and run a few
> crash tests.  Feel free to grab the replacement patch.
> 
> The fixes to the code put back in the addr unsigned long, since we
> use it for gzip compression page checks (really, it's for all types
> of compression, but gzip's really the one that needs it).  I have
> also put some KERNEL_VERSION() tags around the blocks[] field with
> the kiobuf.  That field was added with 2.4.4 onward.  I also cleaned
> up the include/asm links left in the kernel-source RPM. :)
> 
> Thanks, Chad.
> 
> --Matt
> 
> On Thu, 15 Nov 2001, Chad N. Tindel wrote:
> |>> Chad, you're more than welcome to test the experimental stuff on
> |>> SourceForge (in the download directory) for 4.0.1, which includes
> |>> the RedHat 7.1 2.4.2-2 kernel patch.  If you see any problems with
> |>> it, let me know, and I'll try to fix whatever bug you may see.
> |>
> |>I can't get it to build on redhat 2.4.2 kernel.  I get the following build
> |>error:
> |>
> |>make[3]: Entering directory `/usr/src/linux-2.4.2/drivers/dump'
> |>gcc -D__KERNEL__ -I/usr/src/linux-2.4.2/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -fno-common -Wno-unused -pipe -mpreferred-stack-boundary=2 -march=i686    -c -o dump_base.o dump_base.c
> |>dump_base.c: In function `dump_kernel_write':
> |>dump_base.c:390: structure has no member named `blocks'
> |>dump_base.c: In function `dump_add_page':
> |>dump_base.c:527: `addr' undeclared (first use in this function)
> |>dump_base.c:527: (Each undeclared identifier is reported only once
> |>dump_base.c:527: for each function it appears in.)
> |>
> |>I'm assuming that on 527 addr should be vaddr.  Is this correct?
> |>
> |>As for the 390 error, i'm not sure what to do about that... I just don't
> |>know that much about the VM area of the kernel.
> |>
> |>Chad
> |>
> |>
> |>
> |>_______________________________________________
> |>Lkcd-general mailing list
> |>Lkcd-general@lists.sourceforge.net
> |>https://lists.sourceforge.net/lists/listinfo/lkcd-general
> |>
> 
> _______________________________________________
> Lkcd-general mailing list
> Lkcd-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/lkcd-general


From yakker@alacritech.com Fri Nov 16 02:33:37 2001
Received: from smtp.alacritech.com ([209.10.208.82])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 164gJU-0006rf-00
	for <lkcd-general@lists.sourceforge.net>; Fri, 16 Nov 2001 02:33:36 -0800
Received: from alacritech.com ([10.1.10.35])
	by smtp.alacritech.com (8.11.2/8.11.2) with ESMTP id fAGAVZK05340;
	Fri, 16 Nov 2001 02:31:35 -0800
Message-ID: <3BF4EA61.2C9E3F2D@alacritech.com>
From: "Matt D. Robinson" <yakker@alacritech.com>
Organization: Alacritech, Inc.
X-Mailer: Mozilla 4.78 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: "Matt D. Robinson" <yakker@aparity.com>,
   "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>,
   lkcd-general@lists.sourceforge.net
Subject: Re: [lkcd-general] LKCD 3.1.3
References: <Pine.LNX.4.30.0111160213220.26869-100000@nakedeye.aparity.com> <3BF4E7B9.A517F162@alacritech.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Fri Nov 16 02:34:02 2001
X-Original-Date: Fri, 16 Nov 2001 02:28:49 -0800

Okay, fixed again.  I unfortunately have to change the irq.c file
in the arch directory directly.  I could clean this up another way,
but this works well for now.

--Matt

"Matt D. Robinson" wrote:
> 
> Well, there's still an issue with irq_affinity[].  I'll fix that
> and get the next version out.
> 
> --Matt
> 
> "Matt D. Robinson" wrote:
> >
> > The build errors should now be fixed, I have to crash, but if I get
> > a chance tomorrow morning I'll boot on a new kernel and run a few
> > crash tests.  Feel free to grab the replacement patch.
> >
> > The fixes to the code put back in the addr unsigned long, since we
> > use it for gzip compression page checks (really, it's for all types
> > of compression, but gzip's really the one that needs it).  I have
> > also put some KERNEL_VERSION() tags around the blocks[] field with
> > the kiobuf.  That field was added with 2.4.4 onward.  I also cleaned
> > up the include/asm links left in the kernel-source RPM. :)
> >
> > Thanks, Chad.
> >
> > --Matt
> >
> > On Thu, 15 Nov 2001, Chad N. Tindel wrote:
> > |>> Chad, you're more than welcome to test the experimental stuff on
> > |>> SourceForge (in the download directory) for 4.0.1, which includes
> > |>> the RedHat 7.1 2.4.2-2 kernel patch.  If you see any problems with
> > |>> it, let me know, and I'll try to fix whatever bug you may see.
> > |>
> > |>I can't get it to build on redhat 2.4.2 kernel.  I get the following build
> > |>error:
> > |>
> > |>make[3]: Entering directory `/usr/src/linux-2.4.2/drivers/dump'
> > |>gcc -D__KERNEL__ -I/usr/src/linux-2.4.2/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -fno-common -Wno-unused -pipe -mpreferred-stack-boundary=2 -march=i686    -c -o dump_base.o dump_base.c
> > |>dump_base.c: In function `dump_kernel_write':
> > |>dump_base.c:390: structure has no member named `blocks'
> > |>dump_base.c: In function `dump_add_page':
> > |>dump_base.c:527: `addr' undeclared (first use in this function)
> > |>dump_base.c:527: (Each undeclared identifier is reported only once
> > |>dump_base.c:527: for each function it appears in.)
> > |>
> > |>I'm assuming that on 527 addr should be vaddr.  Is this correct?
> > |>
> > |>As for the 390 error, i'm not sure what to do about that... I just don't
> > |>know that much about the VM area of the kernel.
> > |>
> > |>Chad
> > |>
> > |>
> > |>
> > |>_______________________________________________
> > |>Lkcd-general mailing list
> > |>Lkcd-general@lists.sourceforge.net
> > |>https://lists.sourceforge.net/lists/listinfo/lkcd-general
> > |>
> >
> > _______________________________________________
> > Lkcd-general mailing list
> > Lkcd-general@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/lkcd-general
> 
> _______________________________________________
> Lkcd-general mailing list
> Lkcd-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/lkcd-general


From wgu@eesn3.ews.uiuc.edu Mon Nov 19 10:49:23 2001
Received: from eesn3.ews.uiuc.edu ([130.126.161.187])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 165tTt-0005iG-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 19 Nov 2001 10:49:21 -0800
Received: from localhost (wgu@localhost)
	by eesn3.ews.uiuc.edu (8.11.0/8.10.1) with ESMTP id fAJInBt17186;
	Mon, 19 Nov 2001 12:49:11 -0600 (CST)
From: gu weining <wgu@ews.uiuc.edu>
To: lkcd-general@lists.sourceforge.net
cc: wgu@ews.uiuc.edu
Message-ID: <Pine.GSO.4.21.0111191140270.17008-100000@eesn3.ews.uiuc.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: [lkcd-general] Some problems about using LKCD
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Nov 19 10:50:02 2001
X-Original-Date: Mon, 19 Nov 2001 12:49:11 -0600 (CST)

Hi friends,

I met quite a few problems when I tried to install and run
lkcd 4.0 under Linux 2.4.8 this past week. Would you please 
help me figure it out at your earliest time? 

1. No 'lkcd config' or 'lkcd save' can be found in 
   /etc/rc.d/rc.sysinit script. Do you mean we need to 
   insert these into "rc.sysinit" file(say, before
   # Start up swapping.) by ourselves?  

2. No "Kerntypes" in the /boot directory. I copied it
   from /usr/src/linux-2.4.8(my top src dir) . Right?

3. Seems lkcd only can handle something like "panic".
   How about other crash such as "system HANG"? 
   For example, "Unable to handle kernel NULL pointer
   dereference"(in arch/i386/mm/fault.c)" will cause 
   hang, can you handle it? There are hundreds of errors
   will cause system hang. I assume you can solve these.

   I notice you modify die() in "arch/i386/kernel/traps.c"
   and add "dump((char *)str, regs);". Sounds like it's
   not enough(at least to me), and cannot cover lots of
   "Unable to..." errors which will cause kernel hang.

4. Where to get linux 2.4.2-2 source(not 2.4.2)? Since 
   new "experimental LKCD 4.0.1"'s patch only support 
   2.4.2-2, I don't know where to get it.

BTW, my /etc/sysconfig/dump is as follows:

DUMP_ACTIVE=1
DUMPDEV=/dev/vmdump
DUMPDIR=/var/log/dump
DUMP_SAVE=1
DUMP_LEVEL=2
DUMP_FLAGS=0
DUMP_COMPRESS=0
PANIC_TIMEOUT=5

Thank you so much. I do hope to use lkcd as my major
tool to support my work. From my point of view, well
documented is much more important than developing 
new version.

Have a nice Thanksgiving?

Weining Gu




From ctindel@falcon.csc.calpoly.edu Mon Nov 19 12:44:51 2001
Received: from falcon.csc.calpoly.edu ([129.65.242.5])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 165vHf-0000lU-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 19 Nov 2001 12:44:51 -0800
Received: from hornet.csc.calpoly.edu (hornet.csc.calpoly.edu [129.65.242.4])
	by falcon.csc.calpoly.edu (8.10.2+Sun/8.10.2) with ESMTP id fAJKid603785;
	Mon, 19 Nov 2001 12:44:39 -0800 (PST)
Received: from localhost (ctindel@localhost)
	by hornet.csc.calpoly.edu (8.10.2+Sun/8.10.2) with ESMTP id fAJKicv13610;
	Mon, 19 Nov 2001 12:44:39 -0800 (PST)
From: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>
To: "Matt D. Robinson" <yakker@aparity.com>
cc: <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] LKCD 3.1.3
In-Reply-To: <Pine.LNX.4.30.0111160213220.26869-100000@nakedeye.aparity.com>
Message-ID: <Pine.GSO.4.33.0111191241200.13541-100000@hornet.csc.calpoly.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Nov 19 12:45:03 2001
X-Original-Date: Mon, 19 Nov 2001 12:44:38 -0800 (PST)

> The build errors should now be fixed, I have to crash, but if I get
> a chance tomorrow morning I'll boot on a new kernel and run a few
> crash tests.  Feel free to grab the replacement patch.

Hi Matt-

There used to be a utility called vmdump that I could run "vmdump config"
with to setup the dumping stuff.  The lkcdutils RPM now has an
/etc/sysconfig/dump instead of /etc/sysconfig/vmdump, but I can't find any
way to make lkcd_config use that file.  I have to actually specify the flags
on the command line.  If this facility has indeed been lost I can probably
modify lkcd_config and send you the patch... I just didn't want to duplicate
work.

Chad



From tjm@sgi.com Mon Nov 19 14:08:24 2001
Received: from rj.sgi.com ([204.94.215.100])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 165waW-0004ll-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 19 Nov 2001 14:08:24 -0800
Received: from nodin.corp.sgi.com (nodin.corp.sgi.com [192.26.51.193])
	by rj.sgi.com (8.11.4/8.11.4/linux-outbound_gateway-1.1) with ESMTP id fAJM8MY22549
	for <lkcd-general@lists.sourceforge.net>; Mon, 19 Nov 2001 14:08:22 -0800
Received: from loco.csd.sgi.com (loco.csd.sgi.com [130.62.73.130])
	by nodin.corp.sgi.com (8.11.4/8.11.2/nodin-1.0) with ESMTP id fAJM8M413107240;
	Mon, 19 Nov 2001 14:08:22 -0800 (PST)
Received: from striker (mtv-vpn-hw-tjm-2.corp.sgi.com [134.15.18.147]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id OAA07974; Mon, 19 Nov 2001 14:04:10 -0800 (PST)
Message-ID: <019c01c17148$3404ac30$93120f86@corp.sgi.com>
From: "Tom Morano" <tjm@sgi.com>
To: <lkcd-general@lists.sourceforge.net>
Cc: "Tom Morano" <tjm@sgi.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Subject: [lkcd-general] Changes to lkcdutils to address SGI's snia, discontiguous memory, etc.
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Nov 19 14:09:03 2001
X-Original-Date: Mon, 19 Nov 2001 14:19:06 -0800

I've made a number of changes to various lkcdutils modules to address
architecture specific issues pertaining to SGI's snia platform. There
were two issues in particular that precipitated the need for change:

o On multi-node systems, where each node has its own physical memory 
  installed, it's possible (probable) for large holes in the physical
  memory space to exist. There needs to be a way to identify which
  physical memory addresses are valid and which are not.

o It's possible that, with certain configurations, virtual addresses
  of global text and data objects may not map directly to the physical 
  address space. It's also possible, on some systems, physical memory 
  will not start at address 0x0. 

The current location of such functions as kl_virtop() did not make 
it easy to implement a solution for these architecture specific
problems. As a result, I have moved the kl_virtop() function and
several other functions into the architecture specific portion of 
libklib.  This will allow architecture specific differences in 
physical memory alignment to be addressed more easily. Specifically, 
here is an overview of the changes I made:

o Moved a number of functions from the generic portion of libklib
  to libklib/arch/*/kl_kern.c :

  - kl_virtop() from libklib/kl_mem.c 
  - kl_init_vtop() and kl_init_high_memory() from libklib/kl_memory.c
  - kl_kernelstack() from likbli/kl_task.c 

o Replaced the kl_init_kern_info() function with kl_arch_init()
  and moved where this function was called from (it had been from 
  lcrash/lkcd_ksyms -- I moved it to libklib/klib.c). From this 
  point on, place any architecture specific initialization calls 
  in the kl_arch_init() function.

o Added functions for mapping valid physical memory addresses at
  libklib initialization (snia specific).

o Added functions for determining if a physical address is valid 
  and for determining the next valid physical address (mainly 
  to prevent problems when doing a live dump or trying to access 
  non-existent but legal physical memory addresses).

I have tested out the changes on ia64 and i386 systems, and things 
appear to work just fine. I do not have access to an alpha or 
s390 system for testing, but the changes for these architectures 
were minimal (mainly just moving the code around). It would be 
nice if someone with access to these platforms could check this 
out though.

I've included below, a list of the modified source files. I'm 
assuming Matt, that these changes will get rolled into the 
new experimental release?

Feedback and comments, of course, are welcome.

Thanks,

Tom

===================================================================
lcrash/main.c
libklib/kl_mem.c
libklib/kl_memory.c
libklib/kl_task.c
libklib/klib.c
libklib/arch/alpha/kl_kern.c
libklib/arch/alpha/kl_page.c
libklib/arch/i386/kl_kern.c
libklib/arch/i386/kl_page.c
libklib/arch/ia64/kl_kern.c
libklib/arch/ia64/kl_page.c
libklib/arch/s390/kl_kern.c
libklib/arch/s390/kl_page.c
libklib/arch/s390x/kl_kern.c
libklib/arch/s390x/kl_page.c
libklib/include/kl_mem.h
libklib/include/asm-i386/kl_arch.h
libklib/include/asm-ia64/kl_arch.h
lkcdutils/lkcd_config/lkcd_config.c
lkcdutils/lkcd_ksyms/main.c

Consolidated all architecture specific initialization functionality 
(virtop, kernelstack, system memory mapping, etc.) into the 
architecture specific directories.

===================================================================
lcrash/struct.c

Temporary snia fix (to prevent endless loop of task command).


===================================================================
lcrash/vmdump.c

Add support for live dump on systems with DISCONTIG memory..


===================================================================
libklib/kl_symbol.c

Removed all references to locore text addresses (caused lcrash
initialization problems and was no longer needed).




From yakker@aparity.com Mon Nov 19 17:18:56 2001
Received: from w032.z064001165.sjc-ca.dsl.cnc.net ([64.1.165.32] helo=nakedeye.aparity.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 165zYo-0004OG-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 19 Nov 2001 17:18:51 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAK1MqR07391;
	Mon, 19 Nov 2001 17:22:53 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>
cc: <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] LKCD 3.1.3
In-Reply-To: <Pine.GSO.4.33.0111191241200.13541-100000@hornet.csc.calpoly.edu>
Message-ID: <Pine.LNX.4.30.0111191721400.7388-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Nov 19 17:19:15 2001
X-Original-Date: Mon, 19 Nov 2001 17:22:52 -0800 (PST)

It's now /sbin/lkcd config, instead of /sbin/vmdump config.  We
moved away from the 'vmdump' name to just 'dump', and in places
where lkcd was appropriate, we used that name.  It's in the latest
lkcdutils RPM in the experimental directory.

Let's hope it doesn't change again ... :)

--Matt

On Mon, 19 Nov 2001, Chad N. Tindel wrote:
|>> The build errors should now be fixed, I have to crash, but if I get
|>> a chance tomorrow morning I'll boot on a new kernel and run a few
|>> crash tests.  Feel free to grab the replacement patch.
|>
|>Hi Matt-
|>
|>There used to be a utility called vmdump that I could run "vmdump config"
|>with to setup the dumping stuff.  The lkcdutils RPM now has an
|>/etc/sysconfig/dump instead of /etc/sysconfig/vmdump, but I can't find any
|>way to make lkcd_config use that file.  I have to actually specify the flags
|>on the command line.  If this facility has indeed been lost I can probably
|>modify lkcd_config and send you the patch... I just didn't want to duplicate
|>work.
|>
|>Chad
|>
|>
|>_______________________________________________
|>Lkcd-general mailing list
|>Lkcd-general@lists.sourceforge.net
|>https://lists.sourceforge.net/lists/listinfo/lkcd-general
|>



From yakker@aparity.com Mon Nov 19 17:20:25 2001
Received: from w032.z064001165.sjc-ca.dsl.cnc.net ([64.1.165.32] helo=nakedeye.aparity.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 165zaF-0004ev-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 19 Nov 2001 17:20:19 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAK1OSE07403;
	Mon, 19 Nov 2001 17:24:28 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: Tom Morano <tjm@sgi.com>
cc: <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] Changes to lkcdutils to address SGI's snia,
 discontiguous memory, etc.
In-Reply-To: <019c01c17148$3404ac30$93120f86@corp.sgi.com>
Message-ID: <Pine.LNX.4.30.0111191723370.7388-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Nov 19 17:21:03 2001
X-Original-Date: Mon, 19 Nov 2001 17:24:28 -0800 (PST)

On Mon, 19 Nov 2001, Tom Morano wrote:
|>I have tested out the changes on ia64 and i386 systems, and things
|>appear to work just fine. I do not have access to an alpha or
|>s390 system for testing, but the changes for these architectures
|>were minimal (mainly just moving the code around). It would be
|>nice if someone with access to these platforms could check this
|>out though.
|>
|>I've included below, a list of the modified source files. I'm
|>assuming Matt, that these changes will get rolled into the
|>new experimental release?

Yes, I'll roll a new lkcdutils RPM after a cvs update.  Let me
get on that (on vacation right now, so I don't have a ton of
time ... :D)

--Matt



From ctindel@falcon.csc.calpoly.edu Mon Nov 19 17:21:22 2001
Received: from falcon.csc.calpoly.edu ([129.65.242.5])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 165zbF-0004nE-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 19 Nov 2001 17:21:21 -0800
Received: from hornet.csc.calpoly.edu (hornet.csc.calpoly.edu [129.65.242.4])
	by falcon.csc.calpoly.edu (8.10.2+Sun/8.10.2) with ESMTP id fAK1L8625674;
	Mon, 19 Nov 2001 17:21:08 -0800 (PST)
Received: from localhost (ctindel@localhost)
	by hornet.csc.calpoly.edu (8.10.2+Sun/8.10.2) with ESMTP id fAK1L8X18818;
	Mon, 19 Nov 2001 17:21:08 -0800 (PST)
From: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>
To: "Matt D. Robinson" <yakker@aparity.com>
cc: <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] LKCD 3.1.3
In-Reply-To: <Pine.LNX.4.30.0111191721400.7388-100000@nakedeye.aparity.com>
Message-ID: <Pine.GSO.4.33.0111191719580.17244-100000@hornet.csc.calpoly.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Nov 19 17:22:04 2001
X-Original-Date: Mon, 19 Nov 2001 17:21:08 -0800 (PST)

> It's now /sbin/lkcd config, instead of /sbin/vmdump config.  We
> moved away from the 'vmdump' name to just 'dump', and in places
> where lkcd was appropriate, we used that name.  It's in the latest
> lkcdutils RPM in the experimental directory.
>
> Let's hope it doesn't change again ... :)

Thanks for the info.  Everything seems to be working fine, except it doesn't
seem to always reboot 5 seconds after the core is done being written as is
specified in the dump config.  Have you seen this problem on other machines
too?

Chad




From yakker@aparity.com Mon Nov 19 17:43:22 2001
Received: from w032.z064001165.sjc-ca.dsl.cnc.net ([64.1.165.32] helo=nakedeye.aparity.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 165zwS-00012L-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 19 Nov 2001 17:43:16 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAK1lGS08504;
	Mon, 19 Nov 2001 17:47:17 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: "Chad N. Tindel" <ctindel@falcon.csc.calpoly.edu>
cc: <tjm@sgi.com>, <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] LKCD 3.1.3
In-Reply-To: <Pine.GSO.4.33.0111191719580.17244-100000@hornet.csc.calpoly.edu>
Message-ID: <Pine.LNX.4.30.0111191745250.8493-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Nov 19 17:44:02 2001
X-Original-Date: Mon, 19 Nov 2001 17:47:16 -0800 (PST)

I haven't.  If /sbin/lkcd config is run, it should specify '5'
as the /proc/sys/kernel/panic value (or whatever you've set in
/etc/sysconfig/dump).  If you run lkcd config and it doesn't show
the right value, let me know.  That would be a bug.

--Matt

P.S.  I've rolled a new 4.0.1-2 lkcdutils RPM.  Let me know if it
      has any problems.  This has all of Tom's stuff in it.

On Mon, 19 Nov 2001, Chad N. Tindel wrote:
|>> It's now /sbin/lkcd config, instead of /sbin/vmdump config.  We
|>> moved away from the 'vmdump' name to just 'dump', and in places
|>> where lkcd was appropriate, we used that name.  It's in the latest
|>> lkcdutils RPM in the experimental directory.
|>>
|>> Let's hope it doesn't change again ... :)
|>
|>Thanks for the info.  Everything seems to be working fine, except it doesn't
|>seem to always reboot 5 seconds after the core is done being written as is
|>specified in the dump config.  Have you seen this problem on other machines
|>too?
|>
|>Chad



From barada@zambeel.com Tue Nov 20 15:23:27 2001
Received: from imrelay.zambeel.com ([63.89.188.9])
	by usw-sf-list1.sourceforge.net with esmtp 
	(Cipher TLSv1:DES-CBC3-SHA:168) (Exim 3.31-VA-mm2 #1 (Debian))
	id 166KEg-00068n-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 20 Nov 2001 15:23:26 -0800
Received: from xchange.zambeel.com (exchange [63.89.188.10])
	by imrelay.zambeel.com (8.11.0/8.11.0) with ESMTP id fAKNXC504941
	for <lkcd-general@lists.sourceforge.net>; Tue, 20 Nov 2001 15:33:12 -0800
Received: by exchange.zambeel.com with Internet Mail Service (5.5.2653.19)
	id <WZ3W34SH>; Tue, 20 Nov 2001 15:23:20 -0800
Received: from zambeel.com (barada-linux.zambeel.com [10.0.10.80]) by xchange.zambeel.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13)
	id WZ3W34SG; Tue, 20 Nov 2001 15:23:18 -0800
From: Barada Mishra <barada@Zambeel.com>
To: lkcd-general@lists.sourceforge.net
Message-ID: <3BFAE5E6.B2D81BCE@zambeel.com>
X-Mailer: Mozilla 4.72 [en] (X11; U; Linux 2.2.16-zambeel i686)
X-Accept-Language: en
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Subject: [lkcd-general] lkcd patch for 2.4.13+
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov 20 15:24:13 2001
X-Original-Date: Tue, 20 Nov 2001 15:23:18 -0800

Hello,

Is there a lkcd patch available for 2.4.13 or later releases?
lkcd.sourceforge.net has the latest patch for 2.4.8.
Can I get an experimental patch from somebody?
If not, when can I expect one? Or any help converting the 2.4.8 patch to
2.4.13 patch will be highly appreciated.

Thanks for your help,
Barada




From Valerie.Carr@unisys.com Tue Nov 20 16:11:34 2001
Received: from eamail1-out.unisys.com ([192.61.61.99])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 166KzF-0003QP-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 20 Nov 2001 16:11:33 -0800
Received: from us-ea-gtwy-6.ea.unisys.com (us-ea-gtwy-6.ea.unisys.com [192.61.146.102])
	by eamail1-out.unisys.com (8.9.3/8.9.3) with ESMTP id AAA27352
	for <lkcd-general@lists.sourceforge.net>; Wed, 21 Nov 2001 00:09:19 GMT
Received: by us-ea-gtwy-6.ea.unisys.com with Internet Mail Service (5.5.2653.19)
	id <W0K2ZDLR>; Tue, 20 Nov 2001 18:11:31 -0600
Message-ID: <245F259ABD41D511A07000D0B71C4CBA1E1322@us-slc-exch-3.slc.unisys.com>
From: "Carr, Valerie" <Valerie.Carr@UNISYS.com>
To: "'lkcd-general@lists.sourceforge.net'"
	 <lkcd-general@lists.sourceforge.net>
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Subject: [lkcd-general] Compiling lkcd on ia64
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov 20 16:12:13 2001
X-Original-Date: Tue, 20 Nov 2001 18:11:25 -0600

Hi

I am having trouble compiling lkcd on an ia64 system.  I've downloaded a
2.4.8 kernel and I've installed the latest 4.0 patch.  I then try and
compile and I get the following error:

ld -static -T arch/ia64/vmlinux.lds arch/ia64/kernel/head.o
arch/ia64/kernel/ini
t_task.o init/main.o init/version.o \
        --start-group \
        arch/ia64/kernel/kernel.o arch/ia64/mm/mm.o arch/ia64/ia32/ia32.o
arch/i
a64/dig/dig.a kernel/kernel.o mm/mm.o fs/fs.o ipc/ipc.o \
         drivers/acpi/acpi.o drivers/char/char.o drivers/block/block.o
drivers/m
isc/misc.o drivers/net/net.o drivers/media/media.o drivers/char/agp/agp.o
driver
s/char/drm-4.0/drm.o drivers/ide/idedriver.o drivers/dump/dump.o
drivers/scsi/sc
sidrv.o drivers/cdrom/driver.o drivers/sound/sounddrivers.o
drivers/pci/driver.o
 drivers/video/video.o drivers/usb/usbdrv.o drivers/input/inputdrv.o \
        net/network.o \
        /source/lsrc/248/linux/linux/arch/ia64/lib/lib.a
/source/lsrc/248/linux/
linux/lib/lib.a /source/lsrc/248/linux/linux/arch/ia64/lib/lib.a \
        --end-group \
        -o vmlinux
kernel/kernel.o:/source/lsrc/248/linux/linux/kernel/sched.c:534: undefined
refer
ence to `page_is_ram'
make: *** [vmlinux] Error 1

I cannot see where this error is coming from, I'm thinking that there is an
error somewhere else, not in sched.c.
I was under the impression that lkcd is supposed to work with ia64, but I
have yet to get it working correctly.  Am I wrong? 
Any help would be appreciated.

Thanks,
Valerie



From simon.falvey@veritas.com Tue Nov 20 23:55:31 2001
Received: from bay-bridge.veritas.com ([143.127.3.10] helo=eritas.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 166SEF-0007WW-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 20 Nov 2001 23:55:31 -0800
Received: from veritas.com (sfalvey-lx [127.0.0.1])
	by eritas.com (8.11.6/8.11.2) with ESMTP id fAL7roP02175
	for <lkcd-general@lists.sourceforge.net>; Wed, 21 Nov 2001 07:53:51 GMT
Message-ID: <3BFB5D8D.2020007@veritas.com>
From: Simon Falvey <simon.falvey@veritas.com>
Organization: Veritas Software
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.5) Gecko/20011012
X-Accept-Language: en-gb, en-us
MIME-Version: 1.0
To: lkcd-general@lists.sourceforge.net
Subject: Re: [lkcd-general] lkcd patch for 2.4.13+
References: <3BFAE5E6.B2D81BCE@zambeel.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov 20 23:56:06 2001
X-Original-Date: Wed, 21 Nov 2001 07:53:49 +0000

Barada Mishra wrote:

>Hello,
>
>Is there a lkcd patch available for 2.4.13 or later releases?
>lkcd.sourceforge.net has the latest patch for 2.4.8.
>Can I get an experimental patch from somebody?
>If not, when can I expect one? Or any help converting the 2.4.8 patch to
>2.4.13 patch will be highly appreciated.
>
>Thanks for your help,
>Barada
>
>
>
>_______________________________________________
>Lkcd-general mailing list
>Lkcd-general@lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/lkcd-general
>
Or even for 2.4.9? The latest RH release uses 2.4.9 which has a lot has 
changed code around the whole magic sysrq sequence so stitching in the 
2.4.8 patch is very time consuming. If someone is already working on the 
patch that would be cool.

Cheers

Simon


-- 
Simon Falvey
Product Specialist, VERITAS Technical Services
VERITAS Software Corporation
Reading, United Kingdom
Direct: +44 (0) 118 918 8105
simon.falvey@veritas.com





From yakker@aparity.com Wed Nov 21 19:05:56 2001
Received: from w032.z064001165.sjc-ca.dsl.cnc.net ([64.1.165.32] helo=nakedeye.aparity.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 166kBS-0004MR-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 21 Nov 2001 19:05:50 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAM39oZ10518;
	Wed, 21 Nov 2001 19:09:50 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: Simon Falvey <simon.falvey@veritas.com>
cc: <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] lkcd patch for 2.4.13+
In-Reply-To: <3BFB5D8D.2020007@veritas.com>
Message-ID: <Pine.LNX.4.30.0111211907400.10468-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 21 19:06:02 2001
X-Original-Date: Wed, 21 Nov 2001 19:09:49 -0800 (PST)

I'm working on the latest RH 7.2 patch, and I'll port to 2.4.15
after that.  I've been on vacation, and without real network
access, I can't get as much done. :)

I will try to have it done on Sunday, but I can't promise it.
Next week for sure, unless someone else has already done it.

Any thoughts on the latest 4.0.1 stuff?  Is it working properly?
If so, I'd like to snap and release ...

--Matt

On Wed, 21 Nov 2001, Simon Falvey wrote:
|>Barada Mishra wrote:
|>
|>>Hello,
|>>
|>>Is there a lkcd patch available for 2.4.13 or later releases?
|>>lkcd.sourceforge.net has the latest patch for 2.4.8.
|>>Can I get an experimental patch from somebody?
|>>If not, when can I expect one? Or any help converting the 2.4.8 patch to
|>>2.4.13 patch will be highly appreciated.
|>>
|>>Thanks for your help,
|>>Barada
|>>
|>>
|>>
|>>_______________________________________________
|>>Lkcd-general mailing list
|>>Lkcd-general@lists.sourceforge.net
|>>https://lists.sourceforge.net/lists/listinfo/lkcd-general
|>>
|>Or even for 2.4.9? The latest RH release uses 2.4.9 which has a lot has
|>changed code around the whole magic sysrq sequence so stitching in the
|>2.4.8 patch is very time consuming. If someone is already working on the
|>patch that would be cool.
|>
|>Cheers
|>
|>Simon



From yakker@aparity.com Wed Nov 21 19:10:26 2001
Received: from w032.z064001165.sjc-ca.dsl.cnc.net ([64.1.165.32] helo=nakedeye.aparity.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 166kFn-0004nI-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 21 Nov 2001 19:10:20 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAM3ERC10552;
	Wed, 21 Nov 2001 19:14:27 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: "Carr, Valerie" <Valerie.Carr@UNISYS.com>
cc: "'lkcd-general@lists.sourceforge.net'" <lkcd-general@lists.sourceforge.net>
Subject: Re: [lkcd-general] Compiling lkcd on ia64
In-Reply-To: <245F259ABD41D511A07000D0B71C4CBA1E1322@us-slc-exch-3.slc.unisys.com>
Message-ID: <Pine.LNX.4.30.0111211909580.10468-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 21 19:11:03 2001
X-Original-Date: Wed, 21 Nov 2001 19:14:27 -0800 (PST)

I believe that the problem is in the kernel/ksyms.c file.
On line 360, you'll see:

EXPORT_SYMBOL(page_is_ram);

Simply add the following around it:

#if defined(CONFIG_X86) || defined(CONFIG_ALPHA)
EXPORT_SYMBOL(page_is_ram);
#endif

Let me know if this corrects your problem, and I'll check
the fix into the source tree.

--Matt

On Tue, 20 Nov 2001, Carr, Valerie wrote:
|>Hi
|>
|>I am having trouble compiling lkcd on an ia64 system.  I've downloaded a
|>2.4.8 kernel and I've installed the latest 4.0 patch.  I then try and
|>compile and I get the following error:
|>
|>ld -static -T arch/ia64/vmlinux.lds arch/ia64/kernel/head.o
|>arch/ia64/kernel/ini
|>t_task.o init/main.o init/version.o \
|>        --start-group \
|>        arch/ia64/kernel/kernel.o arch/ia64/mm/mm.o arch/ia64/ia32/ia32.o
|>arch/i
|>a64/dig/dig.a kernel/kernel.o mm/mm.o fs/fs.o ipc/ipc.o \
|>         drivers/acpi/acpi.o drivers/char/char.o drivers/block/block.o
|>drivers/m
|>isc/misc.o drivers/net/net.o drivers/media/media.o drivers/char/agp/agp.o
|>driver
|>s/char/drm-4.0/drm.o drivers/ide/idedriver.o drivers/dump/dump.o
|>drivers/scsi/sc
|>sidrv.o drivers/cdrom/driver.o drivers/sound/sounddrivers.o
|>drivers/pci/driver.o
|> drivers/video/video.o drivers/usb/usbdrv.o drivers/input/inputdrv.o \
|>        net/network.o \
|>        /source/lsrc/248/linux/linux/arch/ia64/lib/lib.a
|>/source/lsrc/248/linux/
|>linux/lib/lib.a /source/lsrc/248/linux/linux/arch/ia64/lib/lib.a \
|>        --end-group \
|>        -o vmlinux
|>kernel/kernel.o:/source/lsrc/248/linux/linux/kernel/sched.c:534: undefined
|>refer
|>ence to `page_is_ram'
|>make: *** [vmlinux] Error 1
|>
|>I cannot see where this error is coming from, I'm thinking that there is an
|>error somewhere else, not in sched.c.
|>I was under the impression that lkcd is supposed to work with ia64, but I
|>have yet to get it working correctly.  Am I wrong?
|>Any help would be appreciated.
|>
|>Thanks,
|>Valerie
|>
|>
|>_______________________________________________
|>Lkcd-general mailing list
|>Lkcd-general@lists.sourceforge.net
|>https://lists.sourceforge.net/lists/listinfo/lkcd-general
|>



From lkcd-general-owner@lists.sourceforge.net Thu Nov 22 10:53:40 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 166yyg-0005cV-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 22 Nov 2001 10:53:38 -0800
Received: from d12lmsgate-2.de.ibm.com (d12lmsgate-2.de.ibm.com [195.212.91.200])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAMJrbo30247
	for <lkcd@oss.sgi.com>; Thu, 22 Nov 2001 11:53:37 -0800
Received: from d12relay02.de.ibm.com (d12relay02.de.ibm.com [9.165.215.23])
	by d12lmsgate-2.de.ibm.com (1.0.0) with ESMTP id TAA27184
	for <lkcd@oss.sgi.com>; Thu, 22 Nov 2001 19:53:30 +0100
Received: from d12ml033.de.ibm.com (d12ml033_cs0 [9.165.223.11])
	by d12relay02.de.ibm.com (8.11.1m3/NCO v4.98) with ESMTP id fAMIreR36986
	for <lkcd@oss.sgi.com>; Thu, 22 Nov 2001 19:53:40 +0100
Subject: [lkcd-general] dump and highmem
To: lkcd@oss.sgi.com
X-Mailer: Lotus Notes Release 5.0.4a  July 24, 2000
Message-ID: <OF140A55BB.3EC52BF0-ONC1256B0C.0062AEB0@de.ibm.com>
From: "Andreas Herrmann" <AHERRMAN@de.ibm.com>
X-MIMETrack: Serialize by Router on D12ML033/12/M/IBM(Release 5.0.8 |June 18, 2001) at
 22/11/2001 19:53:41
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov 22 10:54:02 2001
X-Original-Date: Thu, 22 Nov 2001 19:53:39 +0100

Hi,

I tried out the current cvs version of lcrash. And oops, lcrash fails while
reading its own dumps, generated with
lcrash's livedump command. I observed this on i386.

Details: When initializing KL_HIGH_MEMORY, lcrash fails. Setting
cmp_debug=1, I received following output:

__cmppread(): initiating search for 0x2424a8
__cmppindex(): hash =   8228, addr = 0x2424a8
__cmppindex(): addr = 0x2424a8, tmpptr->addr = 0xc0236000
__cmppread(): page not found! (0x2424a8)

Using cvs version 1.8 of file libklib/kl_cmp.c lcrash works fine. The
corresponding output is:

__cmppread(): initiating search for 0x2424a8
__cmppindex(): hash =   8228, addr = 0x2424a8
__cmppindex(): addr = 0x2424a8, tmpptr->addr = 0x242000
__cmppread(): found the page in the page index!
0x242000: 1247 -> 4096 COMPRESSED, writing 4096 bytes
__cmppinsert(): Malloc occurred! [0]
__cmppinsert(): Inserting page into cache! (0x2424a8) [0]...

Probably the hash values were mixed up ...

As I found out, the error is caused by Kapish's  patch, which was checked
in on 11/14/2001.(See attached mails.)
lcrash works fine with respect to livedumps when using version 1.8 of file
libklib/kl_cmp.c
which contains

paddr = (kaddr_t)dp->dp_address - KL_PAGE_OFFSET;

instead of

paddr = (kaddr_t)dp->dp_address;

in function __cmpconvertaddr().


Maybe someone has time to rework Kapish's patch to be "compatible" with
livedumps, too?

Regards,

Andreas

--
Linux for eServer Development
Tel :  +49-7031-16-4640
Notes mail :  Andreas Herrmann/GERMANY/IBM@IBMDE
email :  aherrman@de.ibm.com

----- Forwarded by Andreas Herrmann/Germany/IBM on 11/22/01 06:57 PM -----
|--------+---------------------------------------->
|        |          Kapish K <kapish@ureach.com>  |
|        |          Sent by:                      |
|        |          lkcd-general-admin@lists.sourc|
|        |          eforge.net                    |
|        |                                        |
|        |                                        |
|        |          11/13/01 06:28 AM             |
|        |          Please respond to kapish      |
|        |                                        |
|--------+---------------------------------------->
  >-----------------------------------------------------------------------------------------------------|
  |                                                                                                     |
  |      To:     lkcd@oss.sgi.com                                                                       |
  |      cc:                                                                                            |
  |      Subject:     [lkcd-general] dump and highmem                                                   |
  |                                                                                                     |
  |                                                                                                     |
  >-----------------------------------------------------------------------------------------------------|



Hello,
           While trying to use lkcd and lcrash ( 4.0 ) on dumps from
highmem enbaled boxes, a colleague noticed what might be a bug
in the lkcd code.
The error seems to occur when lcrash looks at the headrrs of
loaded modules in the dump file, one of which is mapped at the
highmemory region of physical memory.
The dp_address field ( in add_dump_page ) is the virtual address
( obtained frpm page_address(p) which gets the page->vitual
address ), but for pages in highmemory, this would be zero
unless the page was kampped at that point in time during dump.
Is that right?
When lcrash starts, it seems to build an index of physical
pages, and
it uses the dp_address fields to determine the real memory
address by subtracting the kernel page offset (usually
0xC0000000).  Thus, the real memory address of the high memory
pages seem to be incorrect.
We fixed this by changing the code so that dp_address is a
real memory address rather than virtual.
so, the changes we did were the following:
in dump_base.c:
--- drivers/dump/dump_base.c.orig         Wed Nov  7 02:54:53 2001
+++ drivers/dump/dump_base.c        Fri Nov  9 02:24:05 2001
@@ -486,13 +486,12 @@
 #if defined(CONFIG_X86) || defined(CONFIG_ALPHA)
           extern int page_is_ram(unsigned long);
 #endif
-          unsigned long addr, size;
+          unsigned long size;
           dump_page_t dp;
           struct page *p = (struct page *)&(mem_map[mem_loc]);
           void *vaddr;

-          addr = (unsigned long)page_address(p);
-          dp.dp_address = (uint64_t)addr;
+          dp.dp_address = (uint64_t)mem_loc << PAGE_SHIFT;
           dp.dp_flags = DUMP_DH_RAW;

           /*
in lkcdutils-1.0-7.src.rpm:

--- libklib/kl_cmp.c.orig           Sat Jun 16 22:50:10 2001
+++ libklib/kl_cmp.c           Thu Nov  8 18:11:43 2001
@@ -920,7 +920,7 @@
 {
           kaddr_t paddr;

-          paddr = (kaddr_t)dp->dp_address - KL_PAGE_OFFSET;
+          paddr = (kaddr_t)dp->dp_address;
           return (paddr);
 }

This seems to fix our problems with being able to look at trace
records for pages in high memory.
What I am looking for is whether this has already been
identified by the lckd team as a problem, and if so, has the fix
you plan the same? if this is not a problem, what have we missd
in here? And finally, if this is a problem and the the solution
is acceptable, could this change get into lkcd?
TIA

________________________________________________
Get your own "800" number
Voicemail, fax, email, and a lot more
http://www.ureach.com/reg/tag

_______________________________________________
Lkcd-general mailing list
Lkcd-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lkcd-general

----- Forwarded by Andreas Herrmann/Germany/IBM on 11/22/01 06:57 PM -----
|--------+---------------------------------------->
|        |          "Matt D. Robinson"            |
|        |          <yakker@alacritech.com>       |
|        |          Sent by:                      |
|        |          lkcd-general-admin@lists.sourc|
|        |          eforge.net                    |
|        |                                        |
|        |                                        |
|        |          11/15/01 10:53 AM             |
|        |          Please respond to "Matt D.    |
|        |          Robinson"                     |
|        |                                        |
|--------+---------------------------------------->
  >-----------------------------------------------------------------------------------------------------|
  |                                                                                                     |
  |      To:     lkcd-general@lists.sourceforge.net                                                     |
  |      cc:                                                                                            |
  |      Subject:     [lkcd-general] LKCD experimental directory available                              |
  |                                                                                                     |
  |                                                                                                     |
  >-----------------------------------------------------------------------------------------------------|



So, given that I have only so many hours in the day in which
to do testing of any given piece of code, and knowing that any
and all changes for other people's code requires lots and lots
of additional testing, and given that multiple releases are now
being requested, plus working on my own code ...

I've created an "experimental" directory on the download site
on lkcd.sourceforge.net for those developers and bleeding edge
testers who want the latest release of code, but in patch/RPM
form.  I'd normally just release this against a single version,
but I'm getting lots of RedHat requests, which, while reasonable,
are numerous, and I don't have time to test each release.

So here's the deal.

I'll put out the patches, RPMs, etc., but I need some help
testing these.   If you would be interested in testing LKCD for
new releases, I can certainly provide information how to do just
that.  I just can't cover 2.4.2-2, 2.4.3-12, 2.4.9-12, etc.,
etc., etc., much less keep up with the Linus/Alan fiasco.

In the experimental directory right now is the 4.0.1 release.
The differences between 4.0 and 4.0.1 are:

- Patch from Suparna to handle SCSI/SMP interrupt cases, and
  add in an additional bit of code to deal with wakeup on
  kiobufs when dumps are configured;

- Patch from Kapish to deal with dump/highmem pages having an
  invalid dp_address assigned to the page dump headers;

- Patch for reported problem from Alex Aminoff to deal with
  commented-out swap partitions in /etc/fstab being linked as
  the primary dump device;

- Small clean-ups with zlib.h check-in, spec file update, etc.
  Full gzip compression included in the patch;

Please let me know if I've totally broken the RH 7.1 patch, or
if the linux-2.4.8 patch has a problem.  If things work, though,
I'll post the rest of the RedHat patches for 7.1 and 7.2
(including updates) for everyone.  That means 2.4.2-2, 2.4.3-12,
2.4.7-10, 2.4.9-12, and 2.4.9-13 (whew!)

All the files are in:

           lkcd.sourceforge.net/download/experimental/<release>

Thanks, any and all help is appreciated.

--Matt

_______________________________________________
Lkcd-general mailing list
Lkcd-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lkcd-general





From lkcd-general-owner@lists.sourceforge.net Thu Nov 22 15:12:27 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 167311-0003b6-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 22 Nov 2001 15:12:20 -0800
Received: from nakedeye.aparity.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAN0CGo09514
	for <lkcd@oss.sgi.com>; Thu, 22 Nov 2001 16:12:18 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAMNGPY11425;
	Thu, 22 Nov 2001 15:16:26 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: Andreas Herrmann <AHERRMAN@de.ibm.com>
cc: <lkcd@oss.sgi.com>
Subject: Re: [lkcd-general] dump and highmem
In-Reply-To: <OF140A55BB.3EC52BF0-ONC1256B0C.0062AEB0@de.ibm.com>
Message-ID: <Pine.LNX.4.30.0111221514040.11360-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov 22 15:13:02 2001
X-Original-Date: Thu, 22 Nov 2001 15:16:25 -0800 (PST)

Hi, Andreas.  I think it would be better if we fixed the
livedump function to correct the dp_address mechanism to
move it more in-line with the kernel code.  We could fix
this in 'lcrash' itself, and differentiate the read, but
we can fix it in the vmdump.c file and make sure it works
for all cases.

Let me fix this when I get back (Saturday).  Should be as
simple as referencing mem_loc << lkcdinfo.page_shift into
the dp_address.

--Matt

On Thu, 22 Nov 2001, Andreas Herrmann wrote:
|>Hi,
|>
|>I tried out the current cvs version of lcrash. And oops, lcrash fails while
|>reading its own dumps, generated with
|>lcrash's livedump command. I observed this on i386.
|>
|>Details: When initializing KL_HIGH_MEMORY, lcrash fails. Setting
|>cmp_debug=1, I received following output:
|>
|>__cmppread(): initiating search for 0x2424a8
|>__cmppindex(): hash =   8228, addr = 0x2424a8
|>__cmppindex(): addr = 0x2424a8, tmpptr->addr = 0xc0236000
|>__cmppread(): page not found! (0x2424a8)
|>
|>Using cvs version 1.8 of file libklib/kl_cmp.c lcrash works fine. The
|>corresponding output is:
|>
|>__cmppread(): initiating search for 0x2424a8
|>__cmppindex(): hash =   8228, addr = 0x2424a8
|>__cmppindex(): addr = 0x2424a8, tmpptr->addr = 0x242000
|>__cmppread(): found the page in the page index!
|>0x242000: 1247 -> 4096 COMPRESSED, writing 4096 bytes
|>__cmppinsert(): Malloc occurred! [0]
|>__cmppinsert(): Inserting page into cache! (0x2424a8) [0]...
|>
|>Probably the hash values were mixed up ...
|>
|>As I found out, the error is caused by Kapish's  patch, which was checked
|>in on 11/14/2001.(See attached mails.)
|>lcrash works fine with respect to livedumps when using version 1.8 of file
|>libklib/kl_cmp.c
|>which contains
|>
|>paddr = (kaddr_t)dp->dp_address - KL_PAGE_OFFSET;
|>
|>instead of
|>
|>paddr = (kaddr_t)dp->dp_address;
|>
|>in function __cmpconvertaddr().
|>
|>
|>Maybe someone has time to rework Kapish's patch to be "compatible" with
|>livedumps, too?
|>
|>Regards,
|>
|>Andreas
|>
|>--
|>Linux for eServer Development
|>Tel :  +49-7031-16-4640
|>Notes mail :  Andreas Herrmann/GERMANY/IBM@IBMDE
|>email :  aherrman@de.ibm.com
|>
|>----- Forwarded by Andreas Herrmann/Germany/IBM on 11/22/01 06:57 PM -----
|>|--------+---------------------------------------->
|>|        |          Kapish K <kapish@ureach.com>  |
|>|        |          Sent by:                      |
|>|        |          lkcd-general-admin@lists.sourc|
|>|        |          eforge.net                    |
|>|        |                                        |
|>|        |                                        |
|>|        |          11/13/01 06:28 AM             |
|>|        |          Please respond to kapish      |
|>|        |                                        |
|>|--------+---------------------------------------->
|>  >-----------------------------------------------------------------------------------------------------|
|>  |                                                                                                     |
|>  |      To:     lkcd@oss.sgi.com                                                                       |
|>  |      cc:                                                                                            |
|>  |      Subject:     [lkcd-general] dump and highmem                                                   |
|>  |                                                                                                     |
|>  |                                                                                                     |
|>  >-----------------------------------------------------------------------------------------------------|
|>
|>
|>
|>Hello,
|>           While trying to use lkcd and lcrash ( 4.0 ) on dumps from
|>highmem enbaled boxes, a colleague noticed what might be a bug
|>in the lkcd code.
|>The error seems to occur when lcrash looks at the headrrs of
|>loaded modules in the dump file, one of which is mapped at the
|>highmemory region of physical memory.
|>The dp_address field ( in add_dump_page ) is the virtual address
|>( obtained frpm page_address(p) which gets the page->vitual
|>address ), but for pages in highmemory, this would be zero
|>unless the page was kampped at that point in time during dump.
|>Is that right?
|>When lcrash starts, it seems to build an index of physical
|>pages, and
|>it uses the dp_address fields to determine the real memory
|>address by subtracting the kernel page offset (usually
|>0xC0000000).  Thus, the real memory address of the high memory
|>pages seem to be incorrect.
|>We fixed this by changing the code so that dp_address is a
|>real memory address rather than virtual.
|>so, the changes we did were the following:
|>in dump_base.c:
|>--- drivers/dump/dump_base.c.orig         Wed Nov  7 02:54:53 2001
|>+++ drivers/dump/dump_base.c        Fri Nov  9 02:24:05 2001
|>@@ -486,13 +486,12 @@
|> #if defined(CONFIG_X86) || defined(CONFIG_ALPHA)
|>           extern int page_is_ram(unsigned long);
|> #endif
|>-          unsigned long addr, size;
|>+          unsigned long size;
|>           dump_page_t dp;
|>           struct page *p = (struct page *)&(mem_map[mem_loc]);
|>           void *vaddr;
|>
|>-          addr = (unsigned long)page_address(p);
|>-          dp.dp_address = (uint64_t)addr;
|>+          dp.dp_address = (uint64_t)mem_loc << PAGE_SHIFT;
|>           dp.dp_flags = DUMP_DH_RAW;
|>
|>           /*
|>in lkcdutils-1.0-7.src.rpm:
|>
|>--- libklib/kl_cmp.c.orig           Sat Jun 16 22:50:10 2001
|>+++ libklib/kl_cmp.c           Thu Nov  8 18:11:43 2001
|>@@ -920,7 +920,7 @@
|> {
|>           kaddr_t paddr;
|>
|>-          paddr = (kaddr_t)dp->dp_address - KL_PAGE_OFFSET;
|>+          paddr = (kaddr_t)dp->dp_address;
|>           return (paddr);
|> }
|>
|>This seems to fix our problems with being able to look at trace
|>records for pages in high memory.
|>What I am looking for is whether this has already been
|>identified by the lckd team as a problem, and if so, has the fix
|>you plan the same? if this is not a problem, what have we missd
|>in here? And finally, if this is a problem and the the solution
|>is acceptable, could this change get into lkcd?
|>TIA
|>
|>________________________________________________
|>Get your own "800" number
|>Voicemail, fax, email, and a lot more
|>http://www.ureach.com/reg/tag
|>
|>_______________________________________________
|>Lkcd-general mailing list
|>Lkcd-general@lists.sourceforge.net
|>https://lists.sourceforge.net/lists/listinfo/lkcd-general
|>
|>----- Forwarded by Andreas Herrmann/Germany/IBM on 11/22/01 06:57 PM -----
|>|--------+---------------------------------------->
|>|        |          "Matt D. Robinson"            |
|>|        |          <yakker@alacritech.com>       |
|>|        |          Sent by:                      |
|>|        |          lkcd-general-admin@lists.sourc|
|>|        |          eforge.net                    |
|>|        |                                        |
|>|        |                                        |
|>|        |          11/15/01 10:53 AM             |
|>|        |          Please respond to "Matt D.    |
|>|        |          Robinson"                     |
|>|        |                                        |
|>|--------+---------------------------------------->
|>  >-----------------------------------------------------------------------------------------------------|
|>  |                                                                                                     |
|>  |      To:     lkcd-general@lists.sourceforge.net                                                     |
|>  |      cc:                                                                                            |
|>  |      Subject:     [lkcd-general] LKCD experimental directory available                              |
|>  |                                                                                                     |
|>  |                                                                                                     |
|>  >-----------------------------------------------------------------------------------------------------|
|>
|>
|>
|>So, given that I have only so many hours in the day in which
|>to do testing of any given piece of code, and knowing that any
|>and all changes for other people's code requires lots and lots
|>of additional testing, and given that multiple releases are now
|>being requested, plus working on my own code ...
|>
|>I've created an "experimental" directory on the download site
|>on lkcd.sourceforge.net for those developers and bleeding edge
|>testers who want the latest release of code, but in patch/RPM
|>form.  I'd normally just release this against a single version,
|>but I'm getting lots of RedHat requests, which, while reasonable,
|>are numerous, and I don't have time to test each release.
|>
|>So here's the deal.
|>
|>I'll put out the patches, RPMs, etc., but I need some help
|>testing these.   If you would be interested in testing LKCD for
|>new releases, I can certainly provide information how to do just
|>that.  I just can't cover 2.4.2-2, 2.4.3-12, 2.4.9-12, etc.,
|>etc., etc., much less keep up with the Linus/Alan fiasco.
|>
|>In the experimental directory right now is the 4.0.1 release.
|>The differences between 4.0 and 4.0.1 are:
|>
|>- Patch from Suparna to handle SCSI/SMP interrupt cases, and
|>  add in an additional bit of code to deal with wakeup on
|>  kiobufs when dumps are configured;
|>
|>- Patch from Kapish to deal with dump/highmem pages having an
|>  invalid dp_address assigned to the page dump headers;
|>
|>- Patch for reported problem from Alex Aminoff to deal with
|>  commented-out swap partitions in /etc/fstab being linked as
|>  the primary dump device;
|>
|>- Small clean-ups with zlib.h check-in, spec file update, etc.
|>  Full gzip compression included in the patch;
|>
|>Please let me know if I've totally broken the RH 7.1 patch, or
|>if the linux-2.4.8 patch has a problem.  If things work, though,
|>I'll post the rest of the RedHat patches for 7.1 and 7.2
|>(including updates) for everyone.  That means 2.4.2-2, 2.4.3-12,
|>2.4.7-10, 2.4.9-12, and 2.4.9-13 (whew!)
|>
|>All the files are in:
|>
|>           lkcd.sourceforge.net/download/experimental/<release>
|>
|>Thanks, any and all help is appreciated.
|>
|>--Matt
|>
|>_______________________________________________
|>Lkcd-general mailing list
|>Lkcd-general@lists.sourceforge.net
|>https://lists.sourceforge.net/lists/listinfo/lkcd-general
|>
|>
|>
|>
|>_______________________________________________
|>Lkcd-general mailing list
|>Lkcd-general@lists.sourceforge.net
|>https://lists.sourceforge.net/lists/listinfo/lkcd-general
|>



From naomi@pst.fujitsu.com Sun Nov 25 22:06:40 2001
Received: from fgwmail7.fujitsu.co.jp ([192.51.44.37])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 168Eud-0002lk-00
	for <lkcd-general@lists.sourceforge.net>; Sun, 25 Nov 2001 22:06:39 -0800
Received: from m4.gw.fujitsu.co.jp by fgwmail7.fujitsu.co.jp (8.9.3/3.7W-MX0110-Fujitsu Gateway)
	id PAA00161 for <lkcd-general@lists.sourceforge.net>; Mon, 26 Nov 2001 15:06:31 +0900 (JST)
	(envelope-from naomi@pst.fujitsu.com)
From: naomi@pst.fujitsu.com
Received: from naomi.aoi.pst.fujitsu.com by m4.gw.fujitsu.co.jp (8.9.3/3.7W-0111-Fujitsu Domain Master)
	id PAA15311 for <lkcd-general@lists.sourceforge.net>; Mon, 26 Nov 2001 15:06:25 +0900 (JST)
	(envelope-from naomi@pst.fujitsu.com)
Received: from localhost (IDENT:naomi@localhost [127.0.0.1])
	by naomi.aoi.pst.fujitsu.com (8.9.3/8.9.3) with ESMTP id PAA21101
	for <lkcd-general@lists.sourceforge.net>; Mon, 26 Nov 2001 15:06:40 +0900
To: lkcd-general@lists.sourceforge.net
In-Reply-To: Your message of "Tue, 04 Sep 2001 16:27:53 +0900"
	<20010904162753R.naomi@pst.fujitsu.com>
References: <20010904162753R.naomi@pst.fujitsu.com>
X-Mailer: Mew version 1.92.4 on Emacs 19.34 / Mule 2.3 (SUETSUMUHANA)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-Id: <20011126150640K.naomi@pst.fujitsu.com>
X-Dispatcher: imput version 980905(IM100)
Lines: 1117
Subject: [lkcd-general] Re: lcrash sub-commands line completion
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Sun Nov 25 22:07:02 2001
X-Original-Date: Mon, 26 Nov 2001 15:06:40 +0900

Hi, all.

I have just implemented the mechanism of sub-commands parameters completion
(second phase) that I mentioned in the mail below.

The attached patch is against the files taken from sourceforge cvs of lkcd.  
It contains the cmd-completion.txt which describes the details of this work.  
Please see it for more information. 

Any comments and suggestions are welcomed.

Regards,
Naomi Haseo

On Tue, 04 Sep 2001, I wrote:
> Hello.
> Recently, I think that lcrash should have "sub-commands line completion".
> 
> Lcrash has many sub-commands. And almost sub-commands have parameters such as 
> filename or symbol name which should be specified.
> The present lcrash cannot complete on sub-commands line.
> For this reason, we have to memorize sub-commands names and parameters exactly.
> It is very inconvenient.
> So I'll add completion capability to librl.
> 
> I'm considering as follows.
> While editing sub-commands line, if TAB key is pressed, lcrash completes the 
> line (or do something as bash does).
> Lcrash will complete on sub-commands names with behavior almost equivalent to 
> bash.
> And I consider that parameters of sub-commands have different characteristic 
> each other, I'll add the mechanism let you be able to make your own completion 
> function. Using this mechanism, you can call the function that behaves as you
> want when TAB key is pressed.
> 
> As the first phase, I will show the completion on sub-commands names by the 
> middle of the month in September.
> And as the next phase, I will show the mechanism of sub-commands parameters
> completion with some sample source using it.
> 
> Is anybody considering sub-commands line completion?
> Any comments and suggestions are welcomed.
> 
> Naomi Haseo
> 


diff -Naur lkcdutils/lcrash/cmds/cmd-completion.txt lkcdutils+argcompl/lcrash/cmds/cmd-completion.txt
--- lkcdutils/lcrash/cmds/cmd-completion.txt	Thu Jan  1 09:00:00 1970
+++ lkcdutils+argcompl/lcrash/cmds/cmd-completion.txt	Thu Nov 22 14:54:39 2001
@@ -0,0 +1,422 @@
+
+		LCRASH sub-command line completion
+
+		Naomi Haseo (naomi@pst.fujitsu.com)
+
+		  Last Update: Nov 22, 2001
+
+0. Introduction
+
+LCRASH provides many sub-commands. And most of them have to be specified
+some parameters such as a file name or a symbol name.
+
+It's too inconvenient to remember sub-command names and their
+parameters exactly.
+So adding sub-command line completion can greatly help us investigating 
+with LCRASH.
+
+1. Overview
+
+While editing sub-command line, if TAB key is pressed, LCRASH completes the 
+line appropriately.
+
+1.1. Complete sub-command name
+
+LCRASH completes sub-command names with behavior almost equivalent to bash.
+
+- When TAB key is pressed at the head of line, LCRASH prints the list of 
+  sub-command names.
+- When TAB key is pressed in the middle of the first word of line, LCRASH
+  completes sub-command names.
+	* When there is no candidate, LCRASH prints a BEEP character.
+	* When there is only one candidate, LCRASH prints it.
+	* When there are two or more candidates,
+		+ When there is the identical string part of them, 
+		  LCRASH prints the identical part.
+		+ When there isn't the identical string part of them, 
+		  LCRASH prints the list of them.
+
+1.2. Complete sub-command's parameters
+
+Each parameter of sub-commands has different characteristic.
+So LCRASH provides facilities easily adding your own completion function.
+Please see section 2 for more detail.
+
+2. Adding your own completion function
+
+You can add your own function which completes sub-command's parameters 
+as follows:
+
+  Step1: Write your own completion function for the sub-command you selected.
+
+    You should name the function "'sub-command name'_complete", and
+    add it to the file "lcrash/cmds/cmd_'sub-command name'.c".
+
+    This function is called with the following API.
+
+      char *'sub-command name'_complete(command_t *cmd)
+
+    You can get various information with the command structure
+    pointed to by cmd.
+
+      cmd->command ... sub-command name
+      cmd->nargs   ... the number of parameters
+      cmd->args[]  ... the array of parameter names
+      cmd->ofp     ... stream to where messages are printed
+                       (stdout by default)
+
+      Note:
+        - In the args field, the args[0] points to first parameter name,
+          not to sub-command name. The args[nargs-1] points to last
+          parameter name.
+        - The last parameter name is truncated by cursor position
+          (See below examples).
+
+      Examples:
+      ------------------------------------------------------------------
+        sub-command line                fields values
+      ------------------------------------------------------------------
+      - "findsym   kernel "             command:        "findsym"
+                 ^  cursor              nargs:          1
+                 +- position            args[0]:        null
+
+      - "findsym   kernel "             command:        "findsym"
+                   ^  cursor            nargs:          1
+                   +- position          args[0]:        null
+
+      - "findsym   kernel "             command:        "findsym"
+                     ^  cursor          nargs:          1
+                     +- position        args[0]:        "ke"
+
+      - "findsym   kernel "             command:        "findsym"
+                         ^  cursor      nargs:          1
+                         +- position    args[0]:        "kernel"
+      ------------------------------------------------------------------
+
+    The function has to return one of the following values.
+
+      a) The address of string
+        The caller of this function will insert the string at cursor
+        position and redraw it without printing newline.
+        The memory region of the string has to be accessible after the
+        function returned. So it has to be allocated statically.
+        Note that the caller will not free it.
+
+      b) DRAW_NEW_ENTER_LINE
+        The caller of this function will print newline and redraw
+        sub-command line.
+
+      c) PRINT_BEEP
+        The caller of this function will print a BEEP character.
+
+    You can use the following common functions in your code.
+
+      - char *complete_standard_options(command_t *cmd);
+
+        This function completes a parameter of the standard option.
+        At this point, only '-w' option is handled.
+
+        The cmd parameter is the same as described above.
+
+        This function returns one of the following values.
+
+        a) The address of string
+          The parameter was completed and the string to be inserted is 
+          returned. This should be returned to the caller of your function
+          without any changes.
+
+        b) DRAW_NEW_ENTER_LINE
+          The parameter wasn't completed but something happened (for
+          example, printing out all candidates).
+          This should be returned to the caller of your function
+          without any changes.
+
+        c) PRINT_BEEP
+          The parameter wasn't completed but something happened (for 
+          example, printing a BEEP character).
+          This should be returned to the caller of your function
+          without any changes.
+
+        d) NOT_COMPLETED
+          The parameter wasn't completed and nothing happened.
+          Handling the completion has to be continued.
+
+      - char *complete_symbol_name(char *keystr, int print_max_candt);
+
+        This function completes the string pointed to by keystr as a
+        symbol name.
+        If there are many candidates and the number of them is more
+        than print_max_candt, it confirms whether display all of them
+        or not.
+
+        This function returns one of the following values.
+
+        a) The address of string
+          The symbol name was completed and the string to be inserted is
+          returned.
+
+        b) DRAW_NEW_ENTER_LINE
+          There were some candidates and they were printed out (or
+          confirmed to print), so a newline should be printed out.
+
+        c) PRINT_BEEP
+          There is no candidate, so a BEEP character should be
+          printed out.
+
+      - char *complete_file_name(char *string, int print_max_candt);
+
+        This function completes the string pointed to by string as a
+        file name.
+        This has the same functionality as complete_symbol_name()'s.
+
+  Step2: Register your function
+
+    In the cmdset[] of lcrash/cmds/cmds.c, find the record of the target
+    sub-command and change the initial value of the cmdcomplete field to
+    your function name.
+
+3. Adding completion ability to FINDSYM sub-command
+
+This section will show you an example of adding completion ability
+to FINDSYM sub-command.
+
+  Step1: We choose FINDSYM to add completion ability.
+         And we wrote the code of findsym_complete() and added it to
+         "lcrash/cmds/cmd_findsym.c"
+
+    The usage of FINDSYM is as follows:
+
+      findsym symname | symaddr [symname | symaddr [...] ]
+              -f string [...] 
+              [-w outfile]
+
+    We'll add completion abilities as follows:
+      - completes the parameter followed by '-w' as a file name.
+      - completes the others as a symbol name.
+
+    The function is written as follows:
+  
+    /*
+     * findsym_complete() -- Complete arguments of 'findsym' command.
+     */
+    char *
+    findsym_complete(command_t *cmd)
+    {
+      char *ret;
+      int i;
+
+      /* cmd->nargs is the number of arguments
+       * cmd->args[] is the array of the arguments
+       * so, the word to complete is cmd->args[cmd->nargs - 1]
+       * cmd->ofp is stdout
+       */
+  
+      /* first, complete the standard options (for example, -w option) 
+       * arguments by the public function.
+       */
+      if ((ret = complete_standard_options(cmd)) != NOT_COMPLETED) {
+        return(ret);
+      }
+      /* if the first character of the word is '-', 
+       * print 'findsym' usage to stdout
+       */
+      if (cmd->args[cmd->nargs - 1] != 0 
+        && cmd->args[cmd->nargs - 1][0] == '-') {
+        fprintf(stdout, "\n");
+        CMD_USAGE(cmd, _FINDSYM_USAGE);
+        return(DRAW_NEW_ENTIRE_LINE);
+      }
+      if (cmd->nargs == 1) {
+        /* if there is one argument, complete symbol name by the 
+         * public function.
+         * if the number of candidates is more 100, ask whether 
+         * display or not the list of them.
+         */
+        return(complete_symbol_name(cmd->args[0], 100));
+      } else {
+        /* if there are two or more arguments */
+        for (i = 0; i < cmd->nargs; i++) {
+          if (!strcmp(cmd->args[i], "-f")) {
+            /* don't complete the word following "-f" */
+            return(PRINT_BEEP);
+          }
+        }
+        /* complete symbol name by the public function */
+        return(complete_symbol_name(cmd->args[cmd->nargs-1], 100));
+      }
+    }
+
+  Step2: We registered findsym_complete() to cmdset[].
+	
+    findsym_complete() is defined in lcrash/cmds/cmds.c as follows:
+
+      extern char *findsym_complete(command_t *); 
+
+    The cmdcomplete field of findsym's record is changed from NULL to
+    findsym_complete as follows:
+
+      _command_t  cmdset[] = {
+		:
+        {"findsym", 0, findsym_cmd, findsym_parse, findsym_help, findsym_usage, findsym_complete},
+		:
+
+4. Design
+
+This section will describe files and data structures that need to be
+modified to implement sub-command line completion.
+
+4.1. LCRASH Design details
+
+4.1.1. Modified Files
+
+The following files require changes to implement sub-command completion:
+
+- lcrash/main.c
+- lcrash/commondefs
+- lcrash/include/command.h
+- lcrash/cmds/command.c
+- lcrash/cmds/cmds.c
+- lcrash/cmds/cmd_findsym.c
+
+4.1.2. New Files
+
+The following files will be added to implement sub-command completion:
+
+- lcrash/cmds/cmd-completion.txt
+
+4.1.3. Modified Data Structures
+
+The following existing data structures need to be altered to implement
+sub-command completion:
+
+- struct _command_t:    (lcrash/include/command.h)
+
+  The following field is added and initialized to NULL.
+
+    cmdcomplete_t   cmdcomplete; /* completion function */
+
+- struct cmd_rec_t:    (lcrash/include/command.h)
+
+  The following field is added and initialized to NULL.
+
+    cmdcomplete_t   cmdcomplete; /* completion function */
+
+4.1.4. Modified Functions
+
+The following functions require changes to implement sub-command completion:
+
+- main()    (lcrash/main.c)
+
+  Call rl_register_complete_func() to register completion function to librl.
+
+- register_cmds()    (lcrash/cmds/command.c)
+
+  Add the initialization of the cmdcomplete field in the cmd_rec_t structure.
+
+- get_cmd()    (lcrash/cmds/command.c)
+
+  Split the block of parsing sub-command line and setting up the
+  command structure into another function (line_to_words()).
+
+4.1.5. New Functions
+
+The following new functions will be added to implement sub-command completion:
+
+- void line_to_words(command_t *);    (lcrash/cmds/command.c)
+
+  Split sub-command line into sub-command name and parameters and set
+  up the command structure.
+
+- char *complete_cmds(char *, int);    (lcrash/cmds/command.c)
+
+  Call line_to_words() to parse sub-command line.
+  If cursor is on the first word
+    Call complete_subcmd_name() to complete sub-command name.
+  Else if the function which completes sub-command's parameters is registered
+    Call the function to complete sub-command's parameters.
+
+- char *complete_subcmd_name(char *);    (lcrash/cmds/command.c)
+
+  Scan 'cmd_tree' and complete sub-command name.
+
+- char *complete_standard_options(command_t *);    (lcrash/cmds/command.c)
+
+  Complete the parameter followed by '-w' as a file name.
+
+- char *complete_symbol_name(char *, int);    (lcrash/cmds/command.c)
+
+  Call kl_get_similar_name() to get the candidates for completion and
+  complete symbol name.
+
+- char *complete_file_name(char *, int);    (lcrash/cmds/command.c)
+
+  Scan file system and complete file name.
+
+- char *findsym_complete(command_t *);    (lcrash/cmds/cmd_findsym.c)
+
+  Complete the parameter followed by '-w' as a file name.
+  Complete the others as a symbol name.
+
+4.2. librl Design details
+
+4.2.1. Modified Files
+
+The following files require changes to implement sub-command completion:
+
+- librl/rl.h
+- librl/rl.c
+
+4.2.2. Modified Functions
+
+The following functions require changes to implement sub-command completion:
+	
+- getinput()    (librl/rl.c)
+
+  If TAB character is pressed, return COMPLETE_LINE.
+
+- getline()    (librl/rl.c)
+
+  If getinput() returns COMPLETE_LINE
+    Call (*rl_complete_func)().
+    According to the return code, insert the string to sub-command
+    line, print a BEEP character or redraw sub-command line with 
+    printing newline.
+
+4.2.3. New Functions
+
+The following new functions will be added to implement sub-command completion:
+
+- void rl_register_complete_func(rl_complete_func_t);    (librl/rl.c)
+
+  Register the function which completes sub-command line.
+
+4.3. libklib Design details
+
+4.3.1. Modified Files
+
+The following files require changes to implement sub-command completion:
+
+- libklib/kl_symbol.c
+- libklib/include/kl_sym.h
+
+4.3.2. Modified Data Structures
+
+The following existing data structures need to be altered to implement
+sub-command completion:
+
+- struct syment_t:    (libklib/include/kl_sym.h)
+
+  The following field is added and initialized to NULL.
+
+    struct syment_s    *s_forward; /* For linked lists */
+
+4.3.3. New Functions
+
+The following new functions will be added to implement sub-command completion:
+
+- syment_t *kl_get_similar_name(char *, char *, int *, int *);    
+  (libklib/kl_symbol.c)
+
+ Scan all symnames of each maplist and make a list of the syment
+ structure containing a match to the given name.
+
diff -Naur lkcdutils/lcrash/cmds/cmd_findsym.c lkcdutils+argcompl/lcrash/cmds/cmd_findsym.c
--- lkcdutils/lcrash/cmds/cmd_findsym.c	Tue Sep 18 10:12:31 2001
+++ lkcdutils+argcompl/lcrash/cmds/cmd_findsym.c	Mon Nov 26 14:26:39 2001
@@ -85,7 +85,7 @@
 
 
 /*
- * findsym_usage() -- Print the usage string for the 'findsy' command.
+ * findsym_usage() -- Print the usage string for the 'findsym' command.
  */
 void
 findsym_usage(command_t *cmd)
@@ -112,4 +112,39 @@
 		return(1);
 	}
 	return(0);
+}
+
+/*
+ * findsym_complete() -- Complete arguments of 'findsym' command.
+ */
+char *
+findsym_complete(command_t *cmd)
+{
+	char *ret;
+	int i;
+
+	/* first, complete the standard options (for example, -w option) arguments */
+	if ((ret = complete_standard_options(cmd)) != NOT_COMPLETED) {
+		return(ret);
+	}
+	/* if the first character of the word is '-', print 'findsym' usage */
+	if (cmd->args[cmd->nargs - 1] != 0 && cmd->args[cmd->nargs - 1][0] == '-') {
+		fprintf(stdout, "\n");
+		CMD_USAGE(cmd, _FINDSYM_USAGE);
+		return(DRAW_NEW_ENTIRE_LINE);
+	}
+	if (cmd->nargs == 1) {
+		/* if there is one argument, complete symbol name */
+		return(complete_symbol_name(cmd->args[0], 100));
+	} else {
+		/* if there are two or more arguments */
+		for (i = 0; i < cmd->nargs; i++) {
+			if (cmd->args[i] && !strcmp(cmd->args[i], "-f")) {
+				/* don't complete the word following "-f" */
+				return(PRINT_BEEP);
+			}
+		}
+		/* complete symbol name */
+		return(complete_symbol_name(cmd->args[cmd->nargs - 1], 100));
+	}
 }
diff -Naur lkcdutils/lcrash/cmds/cmds.c lkcdutils+argcompl/lcrash/cmds/cmds.c
--- lkcdutils/lcrash/cmds/cmds.c	Mon Aug 27 18:41:16 2001
+++ lkcdutils+argcompl/lcrash/cmds/cmds.c	Thu Nov 22 10:44:12 2001
@@ -20,6 +20,7 @@
 
 extern int findsym_cmd(command_t *), findsym_parse(command_t *);
 extern void findsym_help(command_t *), findsym_usage(command_t *);
+extern char *findsym_complete(command_t *);
 
 extern int help_cmd(command_t *), help_parse(command_t *);
 extern void help_help(command_t *), help_usage(command_t *);
@@ -79,54 +80,54 @@
 extern void whatis_help(command_t *), whatis_usage(command_t *);
 
 _command_t  cmdset[] = {
-	{"base", 0, base_cmd, base_parse, base_help, base_usage},
-	{"deftask", 0, deftask_cmd, deftask_parse, deftask_help, deftask_usage},
+	{"base", 0, base_cmd, base_parse, base_help, base_usage, NULL},
+	{"deftask", 0, deftask_cmd, deftask_parse, deftask_help, deftask_usage, NULL},
 	{"dt", "deftask" },
-	{"dis", 0, dis_cmd, dis_parse, dis_help, dis_usage},
+	{"dis", 0, dis_cmd, dis_parse, dis_help, dis_usage, NULL},
 	{"id", "dis" },
-	{"dump", 0, dump_cmd, dump_parse, dump_help, dump_usage},
+	{"dump", 0, dump_cmd, dump_parse, dump_help, dump_usage, NULL},
 	{"od", "dump" },
 	{"md", "dump" },
-	{"findsym", 0, findsym_cmd, findsym_parse, findsym_help, findsym_usage},
+	{"findsym", 0, findsym_cmd, findsym_parse, findsym_help, findsym_usage, findsym_complete},
 	{"fsym", "findsym"},
 	{"symbol", "findsym"},
-	{"help", 0, help_cmd, help_parse, help_help, help_usage},
+	{"help", 0, help_cmd, help_parse, help_help, help_usage, NULL},
 	{"?", "help" },
-	{"history", 0, 0, 0, history_help, history_usage},
+	{"history", 0, 0, 0, history_help, history_usage, NULL},
 	{"h", "history" },
-	{"ldcmds", 0, ldcmds_cmd, ldcmds_parse, ldcmds_help, ldcmds_usage},
+	{"ldcmds", 0, ldcmds_cmd, ldcmds_parse, ldcmds_help, ldcmds_usage, NULL},
 	{"livedump", 0, livedump_cmd, livedump_parse, 
-	 livedump_help, livedump_usage},
-	{"mmap", 0, mmap_cmd, mmap_parse, mmap_help, mmap_usage},
-	{"module", 0, module_cmd, module_parse, module_help, module_usage},
+	 livedump_help, livedump_usage, NULL},
+	{"mmap", 0, mmap_cmd, mmap_parse, mmap_help, mmap_usage, NULL},
+	{"module", 0, module_cmd, module_parse, module_help, module_usage, NULL},
 	{"namelist", 0, namelist_cmd, namelist_parse, namelist_help,
-	 namelist_usage},
+	 namelist_usage, NULL},
 	{"addtypes", "namelist"},
 	{"nmlist", "namelist" },
-	{"page", 0, page_cmd, page_parse, page_help, page_usage},
-	{"quit", 0, quit_cmd, quit_parse, quit_help, quit_usage},
+	{"page", 0, page_cmd, page_parse, page_help, page_usage, NULL},
+	{"quit", 0, quit_cmd, quit_parse, quit_help, quit_usage, NULL},
 	{"q", "quit"},
 	{"q!", "quit"},
-	{"report", 0, report_cmd, report_parse, report_help, report_usage},
-	{"sizeof", 0, sizeof_cmd, sizeof_parse, sizeof_help, sizeof_usage},
+	{"report", 0, report_cmd, report_parse, report_help, report_usage, NULL},
+	{"sizeof", 0, sizeof_cmd, sizeof_parse, sizeof_help, sizeof_usage, NULL},
 	{"offset", "sizeof"},
-	{"stat", 0, stat_cmd, stat_parse, stat_help, stat_usage},
-	{"strace", 0, strace_cmd, strace_parse, strace_help, strace_usage},
+	{"stat", 0, stat_cmd, stat_parse, stat_help, stat_usage, NULL},
+	{"strace", 0, strace_cmd, strace_parse, strace_help, strace_usage, NULL},
 	{"symtab", 0, symtab_cmd, symtab_parse, symtab_help,
-	 symtab_usage},
-	{"task", 0, task_cmd, task_parse, task_help, task_usage},
+	 symtab_usage, NULL},
+	{"task", 0, task_cmd, task_parse, task_help, task_usage, NULL},
 	{"ps", "task"},
-	{"trace", 0, trace_cmd, trace_parse, trace_help, trace_usage},
+	{"trace", 0, trace_cmd, trace_parse, trace_help, trace_usage, NULL},
 	{"t", "trace"},
 	{"bt", "trace"},
-	{"print", 0, print_cmd, print_parse, print_help, print_usage},
+	{"print", 0, print_cmd, print_parse, print_help, print_usage, NULL},
 	{"p", "print" },
 	{"pd", "print" },
 	{"px", "print" },
 	{"po", "print" },
 	{"pb", "print" },
-	{"vtop", 0, vtop_cmd, vtop_parse, vtop_help, vtop_usage},
-	{"walk", 0, walk_cmd, walk_parse, walk_help, walk_usage},
-	{"whatis", 0, whatis_cmd, whatis_parse, whatis_help, whatis_usage},
+	{"vtop", 0, vtop_cmd, vtop_parse, vtop_help, vtop_usage, NULL},
+	{"walk", 0, walk_cmd, walk_parse, walk_help, walk_usage, NULL},
+	{"whatis", 0, whatis_cmd, whatis_parse, whatis_help, whatis_usage, NULL},
 	{(char *)0 }
 };
diff -Naur lkcdutils/lcrash/cmds/command.c lkcdutils+argcompl/lcrash/cmds/command.c
--- lkcdutils/lcrash/cmds/command.c	Wed Nov 21 11:05:04 2001
+++ lkcdutils+argcompl/lcrash/cmds/command.c	Thu Nov 22 10:45:02 2001
@@ -5,6 +5,10 @@
 #include <setjmp.h>
 #include <strings.h>
 #include <sys/ioctl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <dirent.h>
+#include <unistd.h>
 
 cmd_rec_t *cmd_tree = (cmd_rec_t *)0;
 extern jmp_buf klib_jbuf;
@@ -39,6 +43,7 @@
 			cmd_rec->cmdparse = cmds[i].cmdparse;
 			cmd_rec->cmdhelp = cmds[i].cmdhelp;
 			cmd_rec->cmdusage = cmds[i].cmdusage;
+			cmd_rec->cmdcomplete = cmds[i].cmdcomplete;
 		}
 		ret = kl_insert_btnode((btnode_t **)&cmd_tree,
 				(btnode_t *)cmd_rec, 0);
@@ -697,6 +702,7 @@
 	 * not needed.
 	 */
 	clean_cmd(&cmd);
+	cmd.ofp = stdout;
 	strncpy(cline, inputline, tabpos + 1);
 	cline[tabpos] = '\0';
 	cmd.command = cline;
@@ -714,7 +720,6 @@
 		if (!(*(cmd.command))) {
 			/* if TAB is pressed at the head of a command name, 
 			 * display the list of command names */	
-			cmd.ofp = stdout;
 			fprintf(stdout, "\n");
 			help_list(&cmd); /* display the list of command names */
 			return(DRAW_NEW_ENTIRE_LINE);
@@ -724,10 +729,23 @@
 			return(complete_subcmd_name(cmd.command));
 		}
 	} else {
-		/* TAB is pressed on the command arguments */
+		/* TAB is pressed on the command argument */
 		/* call completion function for command arguments */
-		/* -- not implemented -- */
-		return(PRINT_BEEP);
+		cmd_rec_t *crec;
+		cmdcomplete_t cfunc;
+
+		/* get internal data for cmd.command */
+		if ((crec = find_cmd_rec(cmd.command)) == NULL) {
+			/* bad command name */
+			return(PRINT_BEEP);
+		}
+		cfunc = crec->real_cmd ? crec->real_cmd->cmdcomplete : crec->cmdcomplete;
+		if (cfunc) {
+			/* call completion function for command arguments */
+			return(cfunc(&cmd));
+		} else {
+			return(PRINT_BEEP);
+		}
 	}
 }
 
@@ -826,4 +844,306 @@
 	}
 	fflush(stdout);
 	return(DRAW_NEW_ENTIRE_LINE);
+}
+
+/*
+ * complete_standard_options() -- This function completes the standard options 
+ *                                argument. 
+ *                                If there is a standard option, it returns 
+ *                                with the value which the completion function 
+ *                                returned.
+ *                                If there is no standard option, it returns 
+ *                                NOT_COMPLETED.  
+ */
+char *
+complete_standard_options(command_t *cmd)
+{
+	if (cmd->nargs > 1) {
+		if (!strcmp(cmd->args[cmd->nargs - 2], "-w")) {
+			/* if previous word is "-w", complete file name */
+			return(complete_file_name(cmd->args[cmd->nargs -1], 100));
+		} else {
+			return(NOT_COMPLETED);
+		}
+	} else {
+		return(NOT_COMPLETED);
+	}
+}
+
+
+/*
+ * complete_symbol_name() -- This function completes 'keystr' as symbol name.
+ *                           When there is no candidate, return PRINT_BEEP. 
+ *                           When there is a candidate, return the string.
+ *                           When there are two or more candidates, return 
+ *                           the identical part of string of them. 
+ *                           When there isn't the identical part of string, 
+ *                           display the list of candidates and return  
+ *                           DRAW_NEW_ENTIRE_LINE. 		
+ *                           If number of the candidates is more 
+ *                           'print_max_candt', ask whether display or not. 
+ */
+char *
+complete_symbol_name(char *keystr, int print_max_candt)
+{
+	syment_t	*sq_cur, *sq_head;
+	static char retstr[KL_SYMBOL_NAME_LEN];
+	int	candtcnt = 0;
+	int i;
+	int	candt_maxlen;
+	int	print_column;
+	char print_str[8];
+	struct winsize w;
+
+	if (!keystr) {
+		keystr = "";
+	}
+	/* get que of the candidates for symbol name */
+	sq_head = kl_get_similar_name(keystr, retstr, &candtcnt, &candt_maxlen);
+	if (candtcnt == 0) {
+		/* if there is no candidate, return PRINT_BEEP */
+		return(PRINT_BEEP);
+	} else if (candtcnt == 1) {
+		/* if there is a candidate, return string to complete */
+		strcat(retstr, " ");
+		return(retstr);
+	} else { /* candtcnt is 2 or more */
+		if (retstr[0] == '\0') {
+			/* if there is no the identical part of string, print the list of 
+			   candidates */
+			if (print_max_candt && candtcnt >= print_max_candt) {
+				/* if there are number of "print_max_candt" or more candidates, 
+				   ask whether diaplay or not */
+				for (;;) {
+					int c;
+					fprintf(stdout, 
+						"\nDisplay all %d possibilities? (y or n)", candtcnt);
+					c = getc(stdin);
+					if (c == 'y' || c == 'Y') {
+						break;
+					} else if (c == 'n' || c == 'N') {
+						fprintf(stdout, "\n");
+						return(DRAW_NEW_ENTIRE_LINE);
+					} else {
+						continue;
+					}
+				}
+			}
+			/* print list of candidates */
+			/* get the window size */
+			if (ioctl(fileno(stdout), TIOCGWINSZ, &w) < 0) {
+				w.ws_col = 80;
+			}
+			/* get number of the columns suited for printing candidates */ 
+			if (!(print_column = w.ws_col / (candt_maxlen + 1))) {
+				print_column = 1;
+			}
+			sprintf(print_str, "%%-%ds", 
+				(w.ws_col < candt_maxlen + 1) ? w.ws_col : candt_maxlen + 1); 
+			sq_cur = sq_head;
+			for (i = 0; i < candtcnt || sq_cur; i++) {
+				if (i % print_column == 0) {
+					fprintf(stdout, "\n");
+				}
+				fprintf(stdout, print_str, sq_cur->s_name);
+				sq_cur = sq_cur->s_forward;
+			}
+			fprintf(stdout, "\n");
+			fflush(stdout);
+			return(DRAW_NEW_ENTIRE_LINE);
+		} else {
+			/* if there is the identical part of string, return string to
+			   complete */
+			return(retstr);
+		}
+	}
+}
+
+/*
+ * complete_file_name() --  This function completes 'string' as file name.
+ *                          When there is no candidate, return PRINT_BEEP. 
+ *                          When there is a candidate, return the string.
+ *                          When there are two or more candidates, return the 
+ *                          identical part of string of them. 
+ *                          When there isn't the identical part of string, 
+ *                          display the list of candidates and return  
+ *                          DRAW_NEW_ENTIRE_LINE. 		
+ *                          If number of the candidates is more 
+ *                          'print_max_candt', ask whether display or not. 
+ */
+char *
+complete_file_name(char *string, int print_max_candt)
+{
+	char *last_slash_pos;
+	static char dirname[DEF_LENGTH];
+	static char keystr[NAME_MAX + 1], retstr[NAME_MAX + 1];
+	int	candt_maxlen;
+	int	 dirlen, keylen;
+	DIR *dp;
+	struct dirent *dent;
+	int candtcnt = 0;
+	int	i;
+	struct stat sbuf;
+	struct candt_que_s {
+		struct candt_que_s *next;
+		char str[1];
+	} *q_head, *q_tail, *q_cur;
+	int	print_column;
+	char print_str[8];
+	struct winsize w;
+	char *ret;
+
+	/* string is '\0' */
+	if (!string || string[0] == '\0') {
+		strcpy(dirname, "./");
+		strcpy(keystr, "");
+	} else {
+		/* get position of last slash */
+		last_slash_pos = strrchr(string, '/');
+		if (last_slash_pos == NULL) {
+			/* search current directory */
+			strcpy(dirname, "./");
+			strcpy(keystr, string);
+		} else {
+			/* get search directory */
+			dirlen = last_slash_pos - string + 1;
+			strncpy(dirname, string, dirlen);
+			dirname[dirlen] = '\0';
+			/* get key string to complete */
+			strcpy(keystr, string+dirlen); 
+		}
+	}
+	keylen = strlen(keystr);
+	/* open search directory */
+	if ((dp = opendir(dirname)) == NULL) {
+		ret = PRINT_BEEP;
+		goto out;
+	}
+
+	/* initialize q_head */
+	q_head = 0;	
+
+	/* get the queue of candidates for file name */ 
+	while ((dent = readdir(dp)) != NULL) {
+		if (!keylen || !strncmp(dent->d_name, keystr, keylen)) {
+			/* allocate memory for candidates queue */
+			q_cur = kl_alloc_block(sizeof(struct candt_que_s) + 
+				strlen(dent->d_name), K_PERM);
+			if (klib_error) {
+				fprintf(KL_ERRORFP, 
+					"Could not allocate memory for file name completion\n");
+				/* free memory for queue of candidates and return with value of 
+				   'ret'*/
+				ret = PRINT_BEEP;
+				goto out;
+			}
+			strcpy(q_cur->str, dent->d_name);
+			if (candtcnt == 0) {
+				q_head = q_tail = q_cur;
+				/* save return string to 'retstr' */
+				strcpy(retstr, q_cur->str+keylen);
+				/* get max length of candidates */
+				candt_maxlen = strlen(q_cur->str);
+			} else {
+				q_tail->next = q_cur;
+				q_tail = q_cur;
+				if (retstr[0] != '\0') {
+					/* get the identical part of string of candidates and save 
+					   to 'retstr' */
+					for (i = 0; retstr[i] != '\0' && 
+						retstr[i] == *(q_cur->str+keylen+i); i++);
+					retstr[i] = '\0';
+				}
+				/* get max length of candidates */
+				if (candt_maxlen < strlen(q_cur->str)) {
+					candt_maxlen = strlen(q_cur->str);
+				}
+			}
+			q_tail->next = 0;
+			candtcnt++;
+		}
+	}
+	closedir(dp);
+
+	if (candtcnt == 0) {
+		ret = PRINT_BEEP;
+		goto out;
+	} else if (candtcnt == 1) {
+		/* check if file is directory */
+		strcat(dirname, q_head->str);
+		stat(dirname, &sbuf);
+		if (S_ISDIR(sbuf.st_mode)) {
+			strcat(retstr, "/");
+		} else {
+			strcat(retstr, " ");
+		}
+		/* return string to complete */
+		ret = retstr;
+		/* free memory for queue of candidates and return with value of 'ret'*/
+		goto out;
+	} else { /* candtcnt >= 2 */
+		if (retstr[0] == '\0') {
+			/* if there is no the identical part of string, print the list of 
+			   candidates */
+			if (print_max_candt && candtcnt >= print_max_candt) {
+				/* if there are number of "print_max_candt" or more candidates,
+				   ask whether display or not */
+				for (;;) {
+					int c;
+					fprintf(stdout, "\nDisplay all %d possibilities? (y or n)",
+						candtcnt);
+					c = getc(stdin);
+					if (c == 'y' || c == 'Y') {
+						break;
+					} else if (c == 'n' || c == 'N') {
+						fprintf(stdout, "\n");
+						/* free memory for queue of candidates and return with 
+						   value of 'ret' */
+						ret = DRAW_NEW_ENTIRE_LINE; 
+						goto out;
+					} else {
+						continue;
+					}
+				}
+			}
+			/* print list of candidates */
+			/* get the window size */
+			if (ioctl(fileno(stdout), TIOCGWINSZ, &w) < 0) {
+				w.ws_col = 80;
+			}
+			/* get number of the columns suited for printing candidates */ 
+			if (!(print_column = w.ws_col / (candt_maxlen + 1)))
+				print_column = 1;
+			sprintf(print_str, "%%-%ds", 
+				(w.ws_col < candt_maxlen + 1) ? w.ws_col : candt_maxlen + 1); 
+			q_cur = q_head;
+			for (i = 0; i < candtcnt || q_cur; i++) {
+				if (i % print_column == 0) {
+					fprintf(stdout, "\n");
+				}
+				fprintf(stdout, print_str, q_cur->str);
+				q_cur = q_cur->next;
+			}
+			fprintf(stdout, "\n");
+			fflush(stdout);
+			/* free memory for queue of candidates and return with value of 
+			   'ret'*/
+			ret = DRAW_NEW_ENTIRE_LINE;
+			goto out;
+		} else {
+			/* if there is the identical part of string, return string to
+			   complete */
+			/* free memory for queue of candidates and return with value of 
+			   'ret'*/
+			ret = retstr;
+			goto out;
+		}
+	}
+out:
+	while (q_head) {
+		q_cur = q_head;
+		q_head = q_head->next;
+		kl_free_block(q_cur);
+	}
+	return(ret);
 }
diff -Naur lkcdutils/lcrash/include/command.h lkcdutils+argcompl/lcrash/include/command.h
--- lkcdutils/lcrash/include/command.h	Wed Nov 21 11:05:06 2001
+++ lkcdutils+argcompl/lcrash/include/command.h	Thu Nov 22 10:47:42 2001
@@ -32,6 +32,7 @@
 typedef int(*cmdparse_t) (command_t *);
 typedef void(*cmdhelp_t) (command_t *);
 typedef void(*cmdusage_t) (command_t *);
+typedef char *(*cmdcomplete_t) (command_t *);
 
 typedef struct _command {
 	char           *cmd;  	  /* command name */
@@ -40,6 +41,7 @@
 	cmdparse_t 	cmdparse; /* argument parsing function  */
 	cmdhelp_t 	cmdhelp;  /* help function */
 	cmdusage_t	cmdusage; /* usage string function */
+	cmdcomplete_t	cmdcomplete; /* completion function */
 } _command_t;
 
 extern _command_t cmdset[];
@@ -57,6 +59,7 @@
         cmdparse_t              cmdparse;  /* argument parsing function  */
         cmdhelp_t               cmdhelp;   /* help function */
         cmdusage_t              cmdusage;  /* usage string function */
+        cmdcomplete_t           cmdcomplete; /* completion function */
 } cmd_rec_t;
 
 #define cmd_name bt.bt_key
@@ -162,3 +165,9 @@
 int register_cmds(_command_t *);
 char *complete_cmds(char *, int);
 char *complete_subcmd_name(char *);
+char *complete_standard_options(command_t *);
+char *complete_symbol_name(char *, int);
+char *complete_file_name(char *, int);
+
+#define NOT_COMPLETED	(char *)-2
+
diff -Naur lkcdutils/libklib/include/kl_sym.h lkcdutils+argcompl/libklib/include/kl_sym.h
--- lkcdutils/libklib/include/kl_sym.h	Mon Aug 27 18:41:16 2001
+++ lkcdutils+argcompl/libklib/include/kl_sym.h	Thu Nov 22 10:48:11 2001
@@ -14,6 +14,7 @@
 	struct syment_s        *s_next;  /* For linked lists */ 
 	kaddr_t			s_addr;  /* vaddr of symbol */
 	int			s_type;  /* text, data */
+	struct syment_s		*s_forward; /* For linked lists */
 } syment_t;
 
 #define s_name s_bt.bt_key
@@ -80,6 +81,7 @@
 int kl_init_ksyms(int);
 int kl_read_syminfo(char *, maplist_t *, int);
 void kl_free_syminfo(char *);
+syment_t *kl_get_similar_name(char *, char *, int *, int *);
 syment_t *kl_lkup_symname(char *);
 syment_t *_kl_lkup_symname(char *, int, size_t len);
 #define KL_LKUP_SYMNAME(NAME, TYPE, LEN) _kl_lkup_symname(NAME, TYPE, LEN)
diff -Naur lkcdutils/libklib/kl_symbol.c lkcdutils+argcompl/libklib/kl_symbol.c
--- lkcdutils/libklib/kl_symbol.c	Wed Nov 21 11:05:07 2001
+++ lkcdutils+argcompl/libklib/kl_symbol.c	Thu Nov 22 10:56:46 2001
@@ -604,3 +604,69 @@
 	}
 	return(0);
 }
+
+/*
+ * kl_get_similar_name() -- This function gets the queue of symbol names which
+ *                           match 'name' for symbol name completion. 
+ *                           It saves the number of candidates to 'sym_cnt'. 
+ *                           It saves the maximum length of the candidates to 
+ *                           'maxlen'.
+ *                           It saves a string for completion to 'retstr'.
+ *                           Return the head of queue.
+ */
+syment_t *
+kl_get_similar_name(char *name, char *retstr, int *sym_cnt, int *maxlen) 
+{
+	syment_t *sq_cur, *sq_head, *sq_tail;
+	int	namelen;
+	int found;
+	maplist_t *ml;
+	int i;
+
+	/* Search for name in the types list
+	 */
+	kl_reset_error();
+	KL_ERROR = KLE_BAD_SYMNAME;
+	namelen = strlen(name);
+	for(ml=STP; ml!=NULL; ml=ml->next){
+		if (ml->maplist_type == SYM_MAP_KSYM || ml->maplist_type == SYM_MAP_FILE) {
+			/* get the queue of candidates for symbol name */
+			sq_cur = (syment_t *)kl_first_btnode(ml->syminfo->symnames);
+			found = 0;
+			do {
+				if (!namelen || !strncmp(name, sq_cur->s_name, namelen)) {
+					if (!found)
+						found = 1;
+					if (!*sym_cnt) {
+						strcpy(retstr, sq_cur->s_name+namelen);
+						*maxlen = strlen(sq_cur->s_name);
+						sq_head = sq_tail = sq_cur;
+					} else {
+						sq_tail->s_forward = sq_cur;
+						sq_tail = sq_cur;
+						if (retstr[0] != '\0') {
+							/* get the identical part of string of 
+							   candidates and save to 'retstr' */
+							for (i = 0; retstr[i] != '\0' &&
+								retstr[i] == *(sq_cur->s_name+namelen+i);
+								i++); 
+							retstr[i] = '\0';
+						}
+						/* get the maximum length of candidates and save to 'maxlen' */
+						if (*maxlen < strlen(sq_cur->s_name)) {
+							*maxlen = strlen(sq_cur->s_name);
+						}
+					}
+					sq_tail->s_forward = (syment_t *)0;
+					(*sym_cnt)++;
+				} else {
+					if (found)
+						break;
+				}
+			} while ((sq_cur = (syment_t *)kl_next_btnode((btnode_t *)sq_cur)) != NULL);
+		} else {
+			continue;
+		}
+	}
+	return(sq_head); /* return the head of queue */
+}


From vamsi@in.ibm.com Tue Nov 27 00:34:15 2001
Received: from e31.co.us.ibm.com ([32.97.110.129] helo=e31.bld.us.ibm.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 168dgy-0008Ia-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 27 Nov 2001 00:34:12 -0800
Received: from westrelay03.boulder.ibm.com (westrelay03.boulder.ibm.com [9.99.140.24])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id DAA68354;
	Tue, 27 Nov 2001 03:31:18 -0500
Received: from vamsiks.in.ibm.com (vamsiks.in.ibm.com [9.186.133.18])
	by westrelay03.boulder.ibm.com (8.11.1m3/NCO v5.01) with ESMTP id fAR8XgY72130;
	Tue, 27 Nov 2001 01:33:43 -0700
Received: (from vamsi@localhost)
	by vamsiks.in.ibm.com (8.11.2/8.11.2) id fAR90J008368;
	Tue, 27 Nov 2001 14:30:19 +0530
From: "Vamsi Krishna S ." <vamsi@in.ibm.com>
To: lkcd@oss.sgi.com, lkcd-general@lists.sourceforge.net
Cc: bharata <bharata@in.ibm.com>, suparna <bsuparna@in.ibm.com>,
        subodh <subodh@in.ibm.com>
Message-ID: <20011127143019.A8322@in.ibm.com>
Reply-To: vamsi@in.ibm.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
Subject: [lkcd-general] [PATCH]capturing registers/stack on all processors
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov 27 00:35:02 2001
X-Original-Date: Tue, 27 Nov 2001 14:30:19 +0530

Hello, 

Here is a patch against lkcd cvs (as on 11/26/2001) for capturing 
registers on all processors at the time of dumping. 

This has been found to be crucial to debug problems where some of 
the cpus on an SMP are hung (executing a tight loop, interrupts 
disabled).

We send an NMI-class IPI to other cpus to capture the registers
and stack. This is the only guaranteed way to ensure that other 
cpus respond. If they don't respond to NMI, there is absolutely 
nothing we can do in software. 

We need to capture the stack, even though we would prefer not to. 
The reason being that the stack could change between the time the 
registers are captured and the time that page is written out in 
the dumping process. The chages in the stack could be so 
significant as to render backtracing impossible/totally inaccurate. 

Currently, all the changes we made are specific to i386, even
though many of the changes could have been arch-independent. 

Brief list of chages:

kernel:
- extensions to dump_header_asm_t to add fields to capture:
	- smp_num_cpus and dumping_cpu
	- registers of all processors
	- pointers to current tasks
	- pointers to the location where stacks are saved
- remove __dump_save_panic_regs
- collect registers in panic()
- remove all use of dha_esp, dha_eip, dha_regs and use 
  dha_smp_regs consistantly
- cleanup dump_configure_header handling, ie, do it only
  once in dump_execute
- send NMI to all processors and capture their registers,
  current task and kernel stack as part of
  __configure_dump_header
- [bonus] new magic sysrq key 'd' to show the registers
  and, backtrace if inside kernel, on all processors
- [side effect] as part of capturing registers on panic
  we now seem to be able to backtrace correctly in 
  panic dump cases.

lcrash:
- new commands
	- rd
	- defcpu
- rd to display registers captured at the time of taking the
  dump on the processor which is currently the defcpu
- defcpu to set the default cpu and set deftask to the current
  task on that cpu at the time of dump
- new kl_smp_dumptask to determine while backtracing if this
  task is a current task on any of the processors at the time
  of dump
- changes to kl_dumpesp/kl_dumpeip to get the esp/eip values
  from dha_smp_regs.
- changes to get_block() to look at the saved stack if this
  task is a current task on any of the processors and was 
  inside the kernel when the dump was taken
- [unrelated bug fix] fix lkcd_config.c to pass the values of
  dump level, dump flags and compression_type instead of their
  addresses to the ioctl call to set them.


-- 
LKCD Team India
Linux Technology Center,
IBM Software Lab, Bangalore.
Ph: +91 80 5044959
Internet: vamsi@in.ibm.com

--

diff -urN -X dontdiff lkcd_cvs_orig/2.4/arch/i386/kernel/i386_ksyms.c lkcd_cvs_new/2.4/arch/i386/kernel/i386_ksyms.c
--- lkcd_cvs_orig/2.4/arch/i386/kernel/i386_ksyms.c	Mon Sep 24 15:31:42 2001
+++ lkcd_cvs_new/2.4/arch/i386/kernel/i386_ksyms.c	Mon Nov 26 14:03:33 2001
@@ -31,8 +31,6 @@
 
 extern void dump_thread(struct pt_regs *, struct user *);
 extern spinlock_t rtc_lock;
-extern irq_desc_t irq_desc[];
-extern unsigned long irq_affinity[];
 
 #if defined(CONFIG_APM) || defined(CONFIG_APM_MODULE)
 extern void machine_real_restart(unsigned char *, int);
@@ -150,8 +148,6 @@
 #endif
 
 EXPORT_SYMBOL(get_wchan);
-EXPORT_SYMBOL(irq_affinity);
-EXPORT_SYMBOL(irq_desc);
 
 EXPORT_SYMBOL(rtc_lock);
 
@@ -164,4 +160,17 @@
 
 #ifdef CONFIG_X86_PAE
 EXPORT_SYMBOL(empty_zero_page);
+#endif
+
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+extern irq_desc_t irq_desc[];
+extern unsigned long irq_affinity[];
+EXPORT_SYMBOL(irq_affinity);
+EXPORT_SYMBOL(irq_desc);
+#ifdef CONFIG_SMP
+extern void dump_send_ipi(void);
+EXPORT_SYMBOL(dump_send_ipi);
+extern int (*dump_ipi_function_ptr)(struct pt_regs *);
+EXPORT_SYMBOL(dump_ipi_function_ptr);
+#endif
 #endif
diff -urN -X dontdiff lkcd_cvs_orig/2.4/arch/i386/kernel/smp.c lkcd_cvs_new/2.4/arch/i386/kernel/smp.c
--- lkcd_cvs_orig/2.4/arch/i386/kernel/smp.c	Tue Oct 16 12:51:44 2001
+++ lkcd_cvs_new/2.4/arch/i386/kernel/smp.c	Mon Nov 26 14:06:02 2001
@@ -142,6 +142,15 @@
 	 */
 	cfg = __prepare_ICR(shortcut, vector);
 
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+	if (vector == DUMP_VECTOR) {
+		/*
+		 * Setup DUMP IPI to be delivered as an NMI
+		 */
+		cfg = (cfg&~APIC_VECTOR_MASK)|APIC_DM_NMI;
+	}
+#endif	/* CONFIG_DUMP */
+
 	/*
 	 * Send the IPI. The write to APIC_ICR fires this off.
 	 */
@@ -424,6 +433,13 @@
 
 	do_flush_tlb_all_local();
 }
+
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+void dump_send_ipi(void)
+{
+	send_IPI_allbutself(DUMP_VECTOR);
+}
+#endif	
 
 /*
  * this function sends a 'reschedule' IPI to another CPU.
diff -urN -X dontdiff lkcd_cvs_orig/2.4/arch/i386/kernel/traps.c lkcd_cvs_new/2.4/arch/i386/kernel/traps.c
--- lkcd_cvs_orig/2.4/arch/i386/kernel/traps.c	Wed Sep 26 15:16:15 2001
+++ lkcd_cvs_new/2.4/arch/i386/kernel/traps.c	Mon Nov 26 16:46:48 2001
@@ -89,6 +89,105 @@
 
 int kstack_depth_to_print = 24;
 
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+/* 
+ * This code mimics show_trace() etc in arch/i386/kernel/traps.c. We don't 
+ * use them directly as they depend on 8K aligned kernel stacks that our
+ * saved stacks don't satisfy. However, there is move to relax the requirement
+ * on task_struct to be 8K-aligned. Once that happens, we could simpify this
+ * function.
+ */
+void show_this_cpu_state(int cpu, struct pt_regs * regs, struct task_struct *tsk)
+{
+	int i;
+	unsigned long *esp;
+	unsigned char *c;
+	int in_kernel = 1;
+
+	esp = (unsigned long *)regs->esp;
+	c = (unsigned char *)regs->eip;
+
+	if (regs->xcs & 3) {
+		in_kernel = 0;
+	}
+	printk("CPU:    %d\nEIP:    %04x:[<%08lx>]\nEFLAGS: %08lx\n",
+		cpu, 0xffff & regs->xcs, regs->eip, regs->eflags);
+	printk("eax: %08lx   ebx: %08lx   ecx: %08lx   edx: %08lx\n",
+		regs->eax, regs->ebx, regs->ecx, regs->edx);
+	printk("esi: %08lx   edi: %08lx   ebp: %08lx   esp: %p\n",
+		regs->esi, regs->edi, regs->ebp, esp);
+	printk("ds: %04x   es: %04x   ss: %04x\n",
+		regs->xds & 0xffff, regs->xes & 0xffff, regs->xss & 0xffff);
+	if (!tsk) {
+		printk("no stack for this cpu\n");
+		return;
+	}
+	printk("Process %s (pid: %d, stackpage=%08lx)",
+		tsk->comm, tsk->pid, 4096+(regs->esp & ~(THREAD_SIZE-1)));
+	/*
+	 * When in-kernel, we also print out the stack and code at the
+	 * time of the fault..
+	 */
+	if (in_kernel) {
+		unsigned long *stack;
+		unsigned long addr, module_start, module_end;
+		extern char _stext, _etext;
+
+		extern int kstack_depth_to_print;
+
+		esp = (unsigned long *)((unsigned long)tsk + (regs->esp & (THREAD_SIZE-1)));
+
+		printk("\nStack: ");
+		stack = esp;
+		for(i=0; i < kstack_depth_to_print; i++) {
+			if ((unsigned long)stack > (unsigned long)tsk + THREAD_SIZE-1)
+				break;
+			if (i && ((i % 8) == 0))
+				printk("\n       ");
+			printk("%08lx ", *stack++);
+		}
+		
+		printk("\nCall Trace: ");
+		i = 1;
+		stack = esp;
+		module_start = VMALLOC_START;
+		module_end = VMALLOC_END;
+		module_end = 0;
+		while ((unsigned long)stack < (unsigned long)tsk + THREAD_SIZE) {
+			addr = *stack++;
+			/*
+			 * If the address is either in the text segment of the
+			 * kernel, or in the region which contains vmalloc'ed
+			 * memory, it *may* be the address of a calling
+			 * routine; if so, print it so that someone tracing
+			 * down the cause of the crash will be able to figure
+			 * out the call path that was taken.
+			 */
+			if (((addr >= (unsigned long) &_stext) &&
+			     (addr <= (unsigned long) &_etext)) ||
+			    ((addr >= module_start) && (addr <= module_end))) {
+				if (i && ((i % 8) == 0))
+					printk("\n       ");
+				printk("[<%08lx>] ", addr);
+				i++;
+			}
+		}
+		printk("\n");
+
+		printk("\nCode: ");
+		if(regs->eip < PAGE_OFFSET) {
+			printk("eip in user space. error.\n");
+		}
+
+		for(i=0;i<20;i++) {
+			printk("%02x ", *c++);
+		}
+	}
+	printk("\n");
+	return;
+}	
+#endif /* CONFIG_DUMP */
+
 /*
  * These constants are for searching for possible module text
  * segments.
@@ -471,12 +570,33 @@
 }
 #endif
 
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+#ifdef CONFIG_SMP
+int (*dump_ipi_function_ptr)(struct pt_regs *) = NULL;
+static int dump_ipi(struct pt_regs *regs)
+{
+	if (!(dump_ipi_function_ptr && dump_ipi_function_ptr(regs))) {
+		return 0;
+	}
+	ack_APIC_irq();
+	return 1;
+}	
+#else
+#define dump_ipi(regs) 0
+#endif
+#endif
+
 asmlinkage void do_nmi(struct pt_regs * regs, long error_code)
 {
 	unsigned char reason = inb(0x61);
 
 
 	++nmi_count(smp_processor_id());
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+	if (dump_ipi(regs)) {
+		return;
+	}
+#endif
 	if (!(reason & 0xc0)) {
 #if CONFIG_X86_IO_APIC
 		/*
diff -urN -X dontdiff lkcd_cvs_orig/2.4/drivers/char/sysrq.c lkcd_cvs_new/2.4/drivers/char/sysrq.c
--- lkcd_cvs_orig/2.4/drivers/char/sysrq.c	Fri Nov 23 17:25:29 2001
+++ lkcd_cvs_new/2.4/drivers/char/sysrq.c	Tue Nov 27 14:01:47 2001
@@ -96,6 +96,15 @@
 		dump("sysrq", pt_regs);
 		break;
 #endif
+#if defined(CONFIG_DUMP)
+	case 'd':
+		{
+		extern void show_cpu_state(struct pt_regs *);
+		printk("Show state of all cpus\n");
+		show_cpu_state(pt_regs);
+		break;
+		}
+#endif
 
 	case 'o':					    /* O -- power off */
 		if (sysrq_power_off) {
diff -urN -X dontdiff lkcd_cvs_orig/2.4/drivers/dump/dump_base.c lkcd_cvs_new/2.4/drivers/dump/dump_base.c
--- lkcd_cvs_orig/2.4/drivers/dump/dump_base.c	Fri Nov 23 17:25:30 2001
+++ lkcd_cvs_new/2.4/drivers/dump/dump_base.c	Mon Nov 26 16:34:44 2001
@@ -268,14 +268,12 @@
 extern struct new_utsname system_utsname;     /* system information        */
 
 /* external architecture-specific functions */
-extern void __dump_open(struct file *, uint64_t);
+extern void __dump_open(void);
+extern void __dump_cleanup(void);
 extern void __dump_init(uint64_t);
-extern int __dump_configure_header(dump_header_asm_t *, struct pt_regs *);
+extern int __dump_configure_header(struct pt_regs *);
 extern unsigned int  __dump_silence_system(unsigned int);
 extern unsigned int  __dump_resume_system(unsigned int);
-#ifdef CONFIG_X86
-extern void __dump_save_panic_regs(dump_header_asm_t *);
-#endif
 
 /* external functions                                                      */
 extern void si_meminfo(struct sysinfo *);
@@ -736,7 +734,7 @@
 	}
 
 	/* configure architecture-specific dump header values */
-	if (!__dump_configure_header(&dump_header_asm, regs)) {
+	if (!__dump_configure_header(regs)) {
 		return (0);
 	}
 	return (1);
@@ -792,16 +790,11 @@
  *       memory pages and dumps the data to disk (using other functions).
  */
 static int
-dump_execute_memdump(char *panic_str, struct pt_regs *regs)
+dump_execute_memdump(void)
 {
 	int counter = 0, state = 0;
 	unsigned long mem_loc, buf_loc;
 
-	if (!dump_configure_header(panic_str, regs)) {
-		DUMP_PRINT("Dump header could not be configured!");
-		return (-1);
-	}
-
 	DUMP_PRINT("\nDump compression value is 0x%x ...", dump_compress);
 
 	DUMP_PRINT("\nWriting dump header ...");
@@ -939,39 +932,23 @@
 		return;
 	}
 
+	if(!dump_configure_header(panic_str, regs)) {
+		DUMP_PRINT("\ndump header could not be configured!");
+		return;
+	}
+
 	/* silence the system */
 	dump_silence_system();
 
 	/* bail out if we're not going to do any dumping */
 	if (dump_level != DUMP_LEVEL_NONE) {
 		/* inform users of what we are about to do */
-#ifdef CONFIG_SMP
 		DUMP_PRINTN("Dumping to device 0x%x [%s] on CPU %d ...",
 			dump_device, bdevname(dump_device),
 			smp_processor_id());
-#else
-		DUMP_PRINTN("Dumping to device 0x%x [%s] on CPU %d ...",
-			dump_device, bdevname(dump_device),
-			0);
-#endif
 
 		/* start walking through the page tables */
-		state = dump_execute_memdump(panic_str, regs);
-
-#ifdef CONFIG_X86
-		/*
-		 * Okay, this is REALLY annoying to have to
-		 * do.  What this means is that for x86
-		 * systems, we have to literally save the
-		 * esp/eip _now_, because we don't want the
-		 * esp/eip from dump_write_header() or
-		 * anything it calls to conflict with
-		 * re-building the panic() stack trace case.
-		 * So for that reason, we save the eip/esp
-		 * now so we can re-build the trace later.
-		 */
-		__dump_save_panic_regs(&dump_header_asm);
-#endif
+		state = dump_execute_memdump();
 
 		/* update header to disk for the last time */
 		if (dump_write_header() < 0) {
@@ -1054,7 +1031,7 @@
 	struct list_head *tmp;
 	dump_compress_t *dc;
 
-	/* try to remove the compression item */
+	/* try to set the compression type*/
 	list_for_each(tmp, &dump_compress_list) {
 		dc = list_entry(tmp, dump_compress_t, list);
 		if (dc->compress_type == compression_type) {
@@ -1210,6 +1187,7 @@
 			if (!(f->f_flags & O_RDWR)) {
 				return (-EPERM);
 			}
+			__dump_open();
 			return (dump_open_kdev((kdev_t)arg));
 
 		/* get dump_device */
@@ -1423,6 +1401,9 @@
 	if (dump_page_buf) {
 		kfree((const void *)dump_page_buf);
 	}
+
+	/* arch-specific cleanup routine */
+	__dump_cleanup();
 
 	/* remove the proc entries */
 	dump_proc_cleanup();
diff -urN -X dontdiff lkcd_cvs_orig/2.4/drivers/dump/dump_i386.c lkcd_cvs_new/2.4/drivers/dump/dump_i386.c
--- lkcd_cvs_orig/2.4/drivers/dump/dump_i386.c	Thu Oct  4 14:19:49 2001
+++ lkcd_cvs_new/2.4/drivers/dump/dump_i386.c	Tue Nov 27 14:15:26 2001
@@ -21,50 +21,143 @@
 #include <linux/kernel.h>
 #include <linux/smp.h>
 #include <linux/fs.h>
+#include <linux/vmalloc.h>
 #include <linux/dump.h>
 #include <linux/mm.h>
 #include <asm/processor.h>
 #include <asm/hardirq.h>
 #include <linux/irq.h>
 
-extern volatile int dump_in_progress;
-extern unsigned long irq_affinity[NR_IRQS];
 static unsigned long saved_affinity[NR_IRQS];
 
-/*
- * Name: __dump_save_panic_regs()
- * Func: Save the EIP (really the RA).  We may pass an argument later.
- * 	 Save ESP also here. 
- */
-inline void 
-__dump_save_panic_regs(dump_header_asm_t *dha)
-{
-	__asm__ __volatile__("movl  %%esp, %0\n"
-		: "=r" (dha->dha_esp));
-	/* hate to do this, but ... */
-#ifdef CONFIG_FRAME_POINTER
-	__asm__ __volatile__("movl  4(%%esp), %0\n"
-		: "=r" (dha->dha_eip));
+static int alloc_dha_stack(void)
+{
+	int i;
+	void *ptr;
+	
+	if (dump_header_asm.dha_stack[0])
+		return 0;
+
+       	ptr = vmalloc(THREAD_SIZE * smp_num_cpus);
+	if (!ptr) {
+		printk("vmalloc for dha_stacks failed\n");
+		return -ENOMEM;
+	}
+
+	for (i = 0; i < smp_num_cpus; i++) {
+		dump_header_asm.dha_stack[i] = (void *)((unsigned long)ptr + (i * THREAD_SIZE));
+	}
+	return 0;
+}
+
+static int free_dha_stack(void) 
+{
+	if (dump_header_asm.dha_stack[0])
+		vfree(dump_header_asm.dha_stack[0]);	
+	return 0;
+}
+
+/* In case of panic dumps, we collects regs on entry to panic.
+ * so, we shouldn't 'fix' ssesp here again. But it is hard to
+ * tell just looking at regs whether ssesp need fixing. We make
+ * this decision by looking at xss in regs. If we have better
+ * means to determine that ssesp are valid (by some flag which
+ * tells that we are here due to panic dump), then we can use
+ * that instead of this kludge.
+ */
+static inline void 
+fix_ssesp(struct pt_regs *regs, int cpu)
+{
+	if (!user_mode(regs)) {
+		if ((cpu == dump_header_asm.dha_dumping_cpu) &&
+			regs->xss == __KERNEL_DS) 
+			return;
+		dump_header_asm.dha_smp_regs[cpu].esp = 
+				(unsigned long)&(regs->esp);
+		__asm__ __volatile__ ("movw %%ss, %%ax;"
+			:"=a"(dump_header_asm.dha_smp_regs[cpu].xss));
+	}
+}
+
+static void
+save_this_cpu_state(int cpu, struct pt_regs *regs, struct task_struct *tsk)
+{
+	dump_header_asm.dha_smp_regs[cpu] = *regs;
+	dump_header_asm.dha_smp_current_task[cpu] = tsk;
+	fix_ssesp(regs, cpu);
+
+	if (dump_header_asm.dha_stack[cpu]) {
+		memcpy(dump_header_asm.dha_stack[cpu], tsk, THREAD_SIZE);
+	}
+	return;
+}
+
+#ifdef CONFIG_SMP
+static int dump_expect_ipi[NR_CPUS];
+static atomic_t waiting_for_dump_ipi;
+static int wait_for_dump_ipi = 1; /* always wait for ipi to to be handled */
+
+static int
+dump_ipi_handler(struct pt_regs *regs) 
+{
+	int cpu = smp_processor_id();
+
+	if (!dump_expect_ipi[cpu]) {
+		return 0;
+	}
+	
+	save_this_cpu_state(cpu, regs, current);
+
+	dump_expect_ipi[cpu] = 0;
+	atomic_dec(&waiting_for_dump_ipi);
+	return 1;
+}
+
+/* save registers on other processors */
+void 
+save_other_cpu_states(void)
+{
+	int i;
+
+	if (smp_num_cpus > 1) {
+		atomic_set(&waiting_for_dump_ipi, smp_num_cpus-1);
+		for (i = 0; i < NR_CPUS; i++)
+			dump_expect_ipi[i] = 1;
+		
+		dump_ipi_function_ptr = dump_ipi_handler;
+		dump_send_ipi();
+		/* may be we dont need to wait for NMI to be processed. 
+		   just write out the header at the end of dumping, if
+		   this IPI is not processed untill then, there probably
+		   is a problem and we just fail to capture state of 
+		   other cpus. */
+		if (wait_for_dump_ipi) {
+			while(atomic_read(&waiting_for_dump_ipi))
+				barrier();
+			dump_ipi_function_ptr = NULL;
+		}
+	}
+	return;
+}
 #else
-	__asm__ __volatile__("movl  (%%esp), %0\n"
-		: "=r" (dha->dha_eip));
+#define save_other_cpu_states()
 #endif
-}
 
 /*
  * Name: __dump_configure_header()
  * Func: Configure the dump header with all proper values.
  */
 int
-__dump_configure_header(dump_header_asm_t *dha, struct pt_regs *regs)
+__dump_configure_header(struct pt_regs *regs)
 {
-	/* save the dump specific esp/eip */
-	__dump_save_panic_regs(dha);
+	int cpu = smp_processor_id();
 
-	/* one final check -- modify if we're in user mode */
-	if ((regs) && (!user_mode(regs))) {
-		dha->dha_regs.esp = (unsigned long) &(regs->esp);
-	}
+	dump_header_asm.dha_smp_num_cpus = smp_num_cpus;
+	dump_header_asm.dha_dumping_cpu = cpu;
+
+	save_this_cpu_state(cpu, regs, current);
+
+	save_other_cpu_states();
 
 	return (1);
 }
@@ -87,13 +180,28 @@
  *       case it's necessary in the future.
  */
 void
-__dump_open(struct file *dump_file, uint64_t memory_size)
+__dump_open(void)
 {
+	alloc_dha_stack();
 	/* return */
 	return;
 }
 
 /*
+ * Name: __dump_cleanup()
+ * Func: Free any architecture specific data structures. This is called
+ *       when the dump module is being removed.
+ */
+void
+__dump_cleanup(void)
+{
+	free_dha_stack();
+	/* return */
+	return;
+}
+
+#ifdef CONFIG_SMP
+/*
  * Non dumping cpus will spin here. If a cpu is handling an irq when ipi is
  * received, we let go of it here while making sure that it hits schedule
  * on the way up and make it spin there instead.
@@ -108,6 +216,7 @@
 	}
 	return;
 }
+#endif
 
 /*
  * Routine to save the old irq affinities and change affinities of all irqs to
@@ -179,4 +288,24 @@
 
 	/* return */
 	return (0);
+}
+
+/* located in arch/i386/kernel/traps.c */
+extern void show_this_cpu_state(int cpu, struct pt_regs * regs, struct task_struct *tsk);
+
+void 
+show_cpu_state(struct pt_regs * regs)
+{
+	int cpu = smp_processor_id();
+	int i;
+
+	__dump_configure_header(regs);	
+
+	printk("__dump_configure_header done from cpu %d\n", cpu);
+
+	for (i = 0; i < smp_num_cpus; i++) {
+		show_this_cpu_state(i, dump_header_asm.dha_smp_regs[i], dump_header_asm.dha_stack[i]);
+	}
+	
+	return;
 }
diff -urN -X dontdiff lkcd_cvs_orig/2.4/include/asm-i386/dump.h lkcd_cvs_new/2.4/include/asm-i386/dump.h
--- lkcd_cvs_orig/2.4/include/asm-i386/dump.h	Wed Sep 26 15:21:38 2001
+++ lkcd_cvs_new/2.4/include/asm-i386/dump.h	Mon Nov 26 14:11:49 2001
@@ -14,6 +14,7 @@
 
 /* necessary header files */
 #include <asm/ptrace.h>                          /* for pt_regs             */
+#include <linux/threads.h>
 
 /* definitions */
 #define DUMP_ASM_MAGIC_NUMBER     0xdeaddeadULL  /* magic number            */
@@ -45,6 +46,44 @@
 	/* the dump registers */
 	struct pt_regs       dha_regs;
 
+	/* smp specific */
+	uint32_t	     dha_smp_num_cpus;
+	int		     dha_dumping_cpu;	
+	struct pt_regs	     dha_smp_regs[NR_CPUS];
+	void *		     dha_smp_current_task[NR_CPUS];
+	void *		     dha_stack[NR_CPUS];
 } dump_header_asm_t;
+
+#ifdef __KERNEL__
+static inline void get_current_regs(struct pt_regs *regs)
+{
+	__asm__ __volatile__("movl %%ebx,%0" : "=m"(regs->ebx));
+	__asm__ __volatile__("movl %%ecx,%0" : "=m"(regs->ecx));
+	__asm__ __volatile__("movl %%edx,%0" : "=m"(regs->edx));
+	__asm__ __volatile__("movl %%esi,%0" : "=m"(regs->esi));
+	__asm__ __volatile__("movl %%edi,%0" : "=m"(regs->edi));
+	__asm__ __volatile__("movl %%ebp,%0" : "=m"(regs->ebp));
+	__asm__ __volatile__("movl %%eax,%0" : "=m"(regs->eax));
+	__asm__ __volatile__("movl %%esp,%0" : "=m"(regs->esp));
+	__asm__ __volatile__("movw %%ss, %%ax;" :"=a"(regs->xss));
+	__asm__ __volatile__("movw %%cs, %%ax;" :"=a"(regs->xcs));
+	__asm__ __volatile__("movw %%ds, %%ax;" :"=a"(regs->xds));
+	__asm__ __volatile__("movw %%es, %%ax;" :"=a"(regs->xes));
+	__asm__ __volatile__("pushfl; popl %0" :"=m"(regs->eflags));
+	regs->eip = (unsigned long)current_text_addr();
+	
+}
+
+extern volatile int dump_in_progress;
+extern unsigned long irq_affinity[];
+extern dump_header_asm_t dump_header_asm;
+
+#ifdef CONFIG_SMP
+extern int (*dump_ipi_function_ptr)(struct pt_regs *);
+extern void dump_send_ipi(void);
+#else
+#define dump_send_ipi()
+#endif
+#endif /* __KERNEL__ */
 
 #endif /* _ASM_DUMP_H */
diff -urN -X dontdiff lkcd_cvs_orig/2.4/include/asm-i386/hw_irq.h lkcd_cvs_new/2.4/include/asm-i386/hw_irq.h
--- lkcd_cvs_orig/2.4/include/asm-i386/hw_irq.h	Thu Jan  1 05:30:00 1970
+++ lkcd_cvs_new/2.4/include/asm-i386/hw_irq.h	Mon Nov 26 14:13:23 2001
@@ -0,0 +1,226 @@
+#ifndef _ASM_HW_IRQ_H
+#define _ASM_HW_IRQ_H
+
+/*
+ *	linux/include/asm/hw_irq.h
+ *
+ *	(C) 1992, 1993 Linus Torvalds, (C) 1997 Ingo Molnar
+ *
+ *	moved some of the old arch/i386/kernel/irq.h to here. VY
+ *
+ *	IRQ/IPI changes taken from work by Thomas Radke
+ *	<tomsoft@informatik.tu-chemnitz.de>
+ */
+
+#include <linux/config.h>
+#include <asm/atomic.h>
+#include <asm/irq.h>
+
+/*
+ * IDT vectors usable for external interrupt sources start
+ * at 0x20:
+ */
+#define FIRST_EXTERNAL_VECTOR	0x20
+
+#define SYSCALL_VECTOR		0x80
+
+/*
+ * Vectors 0x20-0x2f are used for ISA interrupts.
+ */
+
+/*
+ * Special IRQ vectors used by the SMP architecture, 0xf0-0xff
+ *
+ *  some of the following vectors are 'rare', they are merged
+ *  into a single vector (CALL_FUNCTION_VECTOR) to save vector space.
+ *  TLB, reschedule and local APIC vectors are performance-critical.
+ *
+ *  Vectors 0xf0-0xfa are free (reserved for future Linux use).
+ */
+#define SPURIOUS_APIC_VECTOR	0xff
+#define ERROR_APIC_VECTOR	0xfe
+#define INVALIDATE_TLB_VECTOR	0xfd
+#define RESCHEDULE_VECTOR	0xfc
+#define CALL_FUNCTION_VECTOR	0xfb
+#define DUMP_VECTOR		0xfa
+
+/*
+ * Local APIC timer IRQ vector is on a different priority level,
+ * to work around the 'lost local interrupt if more than 2 IRQ
+ * sources per level' errata.
+ */
+#define LOCAL_TIMER_VECTOR	0xef
+
+/*
+ * First APIC vector available to drivers: (vectors 0x30-0xee)
+ * we start at 0x31 to spread out vectors evenly between priority
+ * levels. (0x80 is the syscall vector)
+ */
+#define FIRST_DEVICE_VECTOR	0x31
+#define FIRST_SYSTEM_VECTOR	0xef
+
+extern int irq_vector[NR_IRQS];
+#define IO_APIC_VECTOR(irq)	irq_vector[irq]
+
+/*
+ * Various low-level irq details needed by irq.c, process.c,
+ * time.c, io_apic.c and smp.c
+ *
+ * Interrupt entry/exit code at both C and assembly level
+ */
+
+extern void mask_irq(unsigned int irq);
+extern void unmask_irq(unsigned int irq);
+extern void disable_8259A_irq(unsigned int irq);
+extern void enable_8259A_irq(unsigned int irq);
+extern int i8259A_irq_pending(unsigned int irq);
+extern void make_8259A_irq(unsigned int irq);
+extern void init_8259A(int aeoi);
+extern void FASTCALL(send_IPI_self(int vector));
+extern void init_VISWS_APIC_irqs(void);
+extern void setup_IO_APIC(void);
+extern void disable_IO_APIC(void);
+extern void print_IO_APIC(void);
+extern int IO_APIC_get_PCI_irq_vector(int bus, int slot, int fn);
+extern void send_IPI(int dest, int vector);
+
+extern unsigned long io_apic_irqs;
+
+extern atomic_t irq_err_count;
+extern atomic_t irq_mis_count;
+
+extern char _stext, _etext;
+
+#define IO_APIC_IRQ(x) (((x) >= 16) || ((1<<(x)) & io_apic_irqs))
+
+#define __STR(x) #x
+#define STR(x) __STR(x)
+
+#define SAVE_ALL \
+	"cld\n\t" \
+	"pushl %es\n\t" \
+	"pushl %ds\n\t" \
+	"pushl %eax\n\t" \
+	"pushl %ebp\n\t" \
+	"pushl %edi\n\t" \
+	"pushl %esi\n\t" \
+	"pushl %edx\n\t" \
+	"pushl %ecx\n\t" \
+	"pushl %ebx\n\t" \
+	"movl $" STR(__KERNEL_DS) ",%edx\n\t" \
+	"movl %edx,%ds\n\t" \
+	"movl %edx,%es\n\t"
+
+#define IRQ_NAME2(nr) nr##_interrupt(void)
+#define IRQ_NAME(nr) IRQ_NAME2(IRQ##nr)
+
+#define GET_CURRENT \
+	"movl %esp, %ebx\n\t" \
+	"andl $-8192, %ebx\n\t"
+
+/*
+ *	SMP has a few special interrupts for IPI messages
+ */
+
+	/* there is a second layer of macro just to get the symbolic
+	   name for the vector evaluated. This change is for RTLinux */
+#define BUILD_SMP_INTERRUPT(x,v) XBUILD_SMP_INTERRUPT(x,v)
+#define XBUILD_SMP_INTERRUPT(x,v)\
+asmlinkage void x(void); \
+asmlinkage void call_##x(void); \
+__asm__( \
+"\n"__ALIGN_STR"\n" \
+SYMBOL_NAME_STR(x) ":\n\t" \
+	"pushl $"#v"\n\t" \
+	SAVE_ALL \
+	SYMBOL_NAME_STR(call_##x)":\n\t" \
+	"call "SYMBOL_NAME_STR(smp_##x)"\n\t" \
+	"jmp ret_from_intr\n");
+
+#define BUILD_SMP_TIMER_INTERRUPT(x,v) XBUILD_SMP_TIMER_INTERRUPT(x,v)
+#define XBUILD_SMP_TIMER_INTERRUPT(x,v) \
+asmlinkage void x(struct pt_regs * regs); \
+asmlinkage void call_##x(void); \
+__asm__( \
+"\n"__ALIGN_STR"\n" \
+SYMBOL_NAME_STR(x) ":\n\t" \
+	"pushl $"#v"\n\t" \
+	SAVE_ALL \
+	"movl %esp,%eax\n\t" \
+	"pushl %eax\n\t" \
+	SYMBOL_NAME_STR(call_##x)":\n\t" \
+	"call "SYMBOL_NAME_STR(smp_##x)"\n\t" \
+	"addl $4,%esp\n\t" \
+	"jmp ret_from_intr\n");
+
+#define BUILD_COMMON_IRQ() \
+asmlinkage void call_do_IRQ(void); \
+__asm__( \
+	"\n" __ALIGN_STR"\n" \
+	"common_interrupt:\n\t" \
+	SAVE_ALL \
+	"pushl $ret_from_intr\n\t" \
+	SYMBOL_NAME_STR(call_do_IRQ)":\n\t" \
+	"jmp "SYMBOL_NAME_STR(do_IRQ));
+
+/* 
+ * subtle. orig_eax is used by the signal code to distinct between
+ * system calls and interrupted 'random user-space'. Thus we have
+ * to put a negative value into orig_eax here. (the problem is that
+ * both system calls and IRQs want to have small integer numbers in
+ * orig_eax, and the syscall code has won the optimization conflict ;)
+ *
+ * Subtle as a pigs ear.  VY
+ */
+
+#define BUILD_IRQ(nr) \
+asmlinkage void IRQ_NAME(nr); \
+__asm__( \
+"\n"__ALIGN_STR"\n" \
+SYMBOL_NAME_STR(IRQ) #nr "_interrupt:\n\t" \
+	"pushl $"#nr"-256\n\t" \
+	"jmp common_interrupt");
+
+extern unsigned long prof_cpu_mask;
+extern unsigned int * prof_buffer;
+extern unsigned long prof_len;
+extern unsigned long prof_shift;
+
+/*
+ * x86 profiling function, SMP safe. We might want to do this in
+ * assembly totally?
+ */
+static inline void x86_do_profile (unsigned long eip)
+{
+	if (!prof_buffer)
+		return;
+
+	/*
+	 * Only measure the CPUs specified by /proc/irq/prof_cpu_mask.
+	 * (default is all CPUs.)
+	 */
+	if (!((1<<smp_processor_id()) & prof_cpu_mask))
+		return;
+
+	eip -= (unsigned long) &_stext;
+	eip >>= prof_shift;
+	/*
+	 * Don't ignore out-of-bounds EIP values silently,
+	 * put them into the last histogram slot, so if
+	 * present, they will show up as a sharp peak.
+	 */
+	if (eip > prof_len-1)
+		eip = prof_len-1;
+	atomic_inc((atomic_t *)&prof_buffer[eip]);
+}
+
+#ifdef CONFIG_SMP /*more of this file should probably be ifdefed SMP */
+static inline void hw_resend_irq(struct hw_interrupt_type *h, unsigned int i) {
+	if (IO_APIC_IRQ(i))
+		send_IPI_self(IO_APIC_VECTOR(i));
+}
+#else
+static inline void hw_resend_irq(struct hw_interrupt_type *h, unsigned int i) {}
+#endif
+
+#endif /* _ASM_HW_IRQ_H */
diff -urN -X dontdiff lkcd_cvs_orig/2.4/kernel/panic.c lkcd_cvs_new/2.4/kernel/panic.c
--- lkcd_cvs_orig/2.4/kernel/panic.c	Tue Oct 16 12:51:46 2001
+++ lkcd_cvs_new/2.4/kernel/panic.c	Mon Nov 26 17:33:33 2001
@@ -56,6 +56,10 @@
 #if defined(CONFIG_ARCH_S390)
         unsigned long caller = (unsigned long) __builtin_return_address(0);
 #endif
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+	struct pt_regs regs;
+	get_current_regs(&regs);
+#endif
 
 	va_start(args, fmt);
 	vsprintf(buf, fmt, args);
@@ -78,7 +82,9 @@
 
 	notifier_call_chain(&panic_notifier_list, 0, NULL);
 
-	dump(buf, NULL);
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+	dump(buf, &regs);
+#endif
 
 	if (panic_timeout > 0)
 	{
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/Makefile lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/Makefile
--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/Makefile	Fri Jan 26 02:42:01 2001
+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/Makefile	Tue Nov 27 13:08:26 2001
@@ -8,7 +8,7 @@
 include $(DEPTH)/commondefs
 
 TARGETS   = $(DEPTH)/libarch.a
-CFILES    = i386_cmds.c cmd_mktrace.c 
+CFILES    = i386_cmds.c cmd_mktrace.c cmd_rd.c cmd_defcpu.c
 OFILES    = $(CFILES:.c=.o)
 
 all: default
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c
--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c	Thu Jan  1 05:30:00 1970
+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c	Tue Nov 27 13:30:01 2001
@@ -0,0 +1,88 @@
+#include <lcrash.h>
+
+extern int get_dump_header_asm(dump_header_asm_t *);
+
+int defcpu = -1;
+
+/*
+ * deftask_cmd() -- Run the 'deftask' command.
+ */
+int
+defcpu_cmd(command_t *cmd)
+{
+	dump_header_asm_t dha;
+	int cpu;
+
+	if (cmd->nargs == 0) {
+		if (defcpu == -1) {
+			fprintf(cmd->efp, "No default cpu set\n");
+		} else {
+			fprintf(cmd->ofp, "Default cpu is %d\n", defcpu);
+		}
+		return(0);
+	} 
+
+	if (MIP->core_type != reg_core) {
+		fprintf(cmd->efp, "Can't use this command on live system\n");
+		return (1);
+	}
+	if (get_dump_header_asm(&dha))
+		return (1);
+
+	cpu = strtol(cmd->args[0], NULL, 10);
+
+	if (cpu >= dha.dha_smp_num_cpus) {
+		fprintf(cmd->efp, "Error setting defcpu to %s\n", cmd->args[0]);
+		return (1);
+	}
+	defcpu = cpu;
+	fprintf(cmd->ofp, "Default cpu is %d\n", defcpu);
+
+	if (dha.dha_stack[defcpu]) {
+		deftask = (kaddr_t)dha.dha_smp_current_task[defcpu];
+		fprintf(cmd->ofp, "Default task is 0x%x\n", deftask);
+	}
+	return (0);
+}
+
+#define _DEFCPU_USAGE	"[-w outfile] [cpu]"
+
+/*
+ * defcpu_usage() -- Print the usage string for the 'defcpu' command.
+ */
+void
+defcpu_usage(command_t *cmd)
+{
+	CMD_USAGE(cmd, _DEFCPU_USAGE);
+}
+
+/*
+ * defcpu_help() -- Print the help information for the 'defcpu' command.
+ */
+void
+defcpu_help(command_t *cmd)
+{
+	CMD_HELP(cmd, _DEFCPU_USAGE,
+	"Set the default cpu if one is indicated. Otherwise print the "
+	"value of default cpu." 
+        "When 'lcrash' is run on a live system, defcpu has no "
+        "meaning.\n\n"
+	"This command also sets the default task to the task running "
+	"on the default cpu at the time the dump is taken. "
+	"The rd command will display the registers on the default cpu "
+	"at the time the dump is taken. "
+        "The trace command will display a trace wrt the task "
+        "running on the default cpu at the time the dump is taken. ");
+}
+
+/*
+ * defcpu_parse() -- Parse the command line arguments for 'defcpu'.
+ */
+int
+defcpu_parse(command_t *cmd)
+{
+	if (set_cmd_flags(cmd, (C_WRITE), 0)) {
+		return(1);
+	}
+	return(0);
+}
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c
--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c	Thu Jan  1 05:30:00 1970
+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c	Mon Nov 26 16:43:31 2001
@@ -0,0 +1,65 @@
+#include <lcrash.h>
+
+extern int get_dump_header_asm(dump_header_asm_t *dump_header_asm);
+extern int defcpu;
+
+#define _RD_USAGE "[-w outfile]"
+
+void
+rd_usage(command_t *cmd)
+{
+	CMD_USAGE(cmd, _RD_USAGE);
+}
+
+void
+rd_help(command_t *cmd)
+{
+	CMD_HELP(cmd, _RD_USAGE,
+			"Display the register contents of the default cpu."
+			"This command can't be used on a live system ");
+}
+
+int
+rd_parse(command_t *cmd)
+{
+	if (set_cmd_flags(cmd, (C_WRITE), 0)) {
+		return(1);
+	}
+	return 0;
+}
+
+int
+rd_cmd(command_t *cmd)
+{
+	dump_header_asm_t dha;
+	struct pt_regs * regs;
+
+	if (cmd->nargs != 0) {
+		rd_usage(cmd);
+		return(1);
+	}	
+
+	if (MIP->core_type != reg_core) {
+		fprintf(cmd->efp, "Can't use this command on live system\n");
+		return(1);
+	}
+	
+	if (get_dump_header_asm(&dha))
+		return(1);
+
+	if (defcpu == -1)
+		defcpu = dha.dha_dumping_cpu;
+	
+	regs = &dha.dha_smp_regs[defcpu];
+
+	fprintf(cmd->ofp, "CPU:    %d   EIP:    %04x:[<%08lx>]\nEFLAGS: %08lx\n",
+		defcpu, regs->xcs & 0xffff, regs->eip, regs->eflags);
+	fprintf(cmd->ofp, "eax: %08lx   ebx: %08lx   ecx: %08lx   edx: %08lx\n",
+		regs->eax, regs->ebx, regs->ecx, regs->edx);
+	fprintf(cmd->ofp, "esi: %08lx   edi: %08lx   ebp: %08lx   esp: %08lx\n",
+		regs->esi, regs->edi, regs->ebp, regs->esp);
+	fprintf(cmd->ofp, "ds: %04x   es: %04x   ss: %04x\n",
+		regs->xds & 0xffff, regs->xes & 0xffff, regs->xss & 0xffff);
+
+	return(0);
+}
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c
--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c	Fri Nov 17 05:06:51 2000
+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c	Tue Nov 27 13:07:45 2001
@@ -6,8 +6,16 @@
 extern int mktrace_cmd(command_t *), mktrace_parse(command_t *);
 extern void mktrace_help(command_t *), mktrace_usage(command_t *);
 
+extern int rd_cmd(command_t *), rd_parse(command_t *);
+extern void rd_help(command_t *), rd_usage(command_t *);
+
+extern int defcpu_cmd(command_t *), defcpu_parse(command_t *);
+extern void defcpu_help(command_t *), defcpu_usage(command_t *);
+
 _command_t i386_cmdset[] = {
 	{"mktrace", 0, mktrace_cmd, mktrace_parse, mktrace_help, mktrace_usage},
 	{"mt", "mktrace" },
+	{"rd", 0, rd_cmd, rd_parse, rd_help, rd_usage},
+	{"defcpu", 0, defcpu_cmd, defcpu_parse, defcpu_help, defcpu_usage},
 	{(char *)0 }
 };
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/lib/trace.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/lib/trace.c
--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/lib/trace.c	Tue Jul  3 19:37:36 2001
+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/lib/trace.c	Mon Nov 26 13:22:32 2001
@@ -741,9 +741,9 @@
 		return(1);
 	} else {
 		saddr = kl_kernelstack(task);
-		if (task == kl_dumptask()) {
-			eip = kl_dumpeip();
-			esp = kl_dumpesp();
+		if (kl_smp_dumptask(task)) {
+			eip = kl_dumpeip(task);
+			esp = kl_dumpesp(task);
 		} else {
 			if (LINUX_2_2_X(KL_LINUX_RELEASE)) {
 				eip = KL_UINT(K_PTR(tsp, "task_struct", "tss"), 
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/arch/i386/kl_dump.c lkcd_cvs_new/lkcdutils/libklib/arch/i386/kl_dump.c
--- lkcd_cvs_orig/lkcdutils/libklib/arch/i386/kl_dump.c	Thu Oct 12 02:32:54 2000
+++ lkcd_cvs_new/lkcdutils/libklib/arch/i386/kl_dump.c	Mon Nov 26 13:11:08 2001
@@ -9,7 +9,7 @@
 /*
  * get_dump_header()
  */
-static int
+int
 get_dump_header(dump_header_t *dump_header)
 {
 	/* first, make sure this isn't a live system
@@ -42,7 +42,7 @@
 /*
  * get_dump_header_asm()
  */
-static int
+int
 get_dump_header_asm(dump_header_asm_t *dump_header_asm)
 {
 	dump_header_t dump_header;
@@ -90,36 +90,40 @@
  * kl_dumpesp()
  */
 kaddr_t
-kl_dumpesp(void)
+kl_dumpesp(kaddr_t tsk)
 {
-	dump_header_asm_t dump_header_asm;
+	dump_header_asm_t dha;
+	int i;
 
-	if (get_dump_header_asm(&dump_header_asm)) {
+	if (get_dump_header_asm(&dha)) {
 		return((kaddr_t)NULL);
 	}
-	if (dump_header_asm.dha_regs.esp) {
-		return((kaddr_t)dump_header_asm.dha_regs.esp);
-	} else { 
-		return((kaddr_t)dump_header_asm.dha_esp);
+	
+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
+		if (tsk == dha.dha_smp_current_task[i])
+			return (dha.dha_smp_regs[i].esp);
 	}
+	return((kaddr_t)NULL);
 }
 
 /*
  * kl_dumpeip()
  */
 kaddr_t
-kl_dumpeip(void)
+kl_dumpeip(kaddr_t tsk)
 {
-	dump_header_asm_t dump_header_asm;
+	dump_header_asm_t dha;
+	int i;
 
-	if (get_dump_header_asm(&dump_header_asm)) {
+	if (get_dump_header_asm(&dha)) {
 		return((kaddr_t)NULL);
 	}
-	if (dump_header_asm.dha_regs.eip) {
-		return((kaddr_t)dump_header_asm.dha_regs.eip);
-	} else { 
-		return((kaddr_t)dump_header_asm.dha_eip);
+	
+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
+		if (tsk == dha.dha_smp_current_task[i])
+			return (dha.dha_smp_regs[i].eip);
 	}
+	return((kaddr_t)NULL);
 }
 
 /*
@@ -134,5 +138,23 @@
 		return((kaddr_t)NULL);
 	}
 	return((kaddr_t)dump_header.dh_current_task);
+	
+}
+
+int
+kl_smp_dumptask(kaddr_t tsk)
+{
+	dump_header_asm_t dha;
+	int i;
+
+	if (get_dump_header_asm(&dha)) {
+		return((kaddr_t)NULL);
+	}
+	
+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
+		if (dha.dha_smp_regs[i].eip > KL_PAGE_OFFSET && tsk == dha.dha_smp_current_task[i])
+			return (1);
+	}
+	return (0);	
 }
 
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/include/asm-i386/dump.h lkcd_cvs_new/lkcdutils/libklib/include/asm-i386/dump.h
--- lkcd_cvs_orig/lkcdutils/libklib/include/asm-i386/dump.h	Wed Sep  5 13:38:00 2001
+++ lkcd_cvs_new/lkcdutils/libklib/include/asm-i386/dump.h	Mon Nov 26 15:41:09 2001
@@ -4,7 +4,8 @@
  * Created by: Matt Robinson (yakker@sgi.com)
  *
  * Copyright 1999 Silicon Graphics, Inc. All rights reserved.
- * 
+ *
+ * This code is released under version 2 of the GNU GPL.
  */
 
 /* This header file holds the architecture specific crash dump header */
@@ -13,6 +14,7 @@
 
 /* necessary header files */
 #include <asm/ptrace.h>                          /* for pt_regs             */
+#include <linux/threads.h>
 
 /* definitions */
 #define DUMP_ASM_MAGIC_NUMBER     0xdeaddeadULL  /* magic number            */
@@ -44,17 +46,44 @@
 	/* the dump registers */
 	struct pt_regs       dha_regs;
 
+	/* smp specific */
+	uint32_t	     dha_smp_num_cpus;
+	int		     dha_dumping_cpu;	
+	struct pt_regs	     dha_smp_regs[NR_CPUS];
+	void *		     dha_smp_current_task[NR_CPUS];
+	void *		     dha_stack[NR_CPUS];
 } dump_header_asm_t;
 
 #ifdef __KERNEL__
-extern void __dump_open(struct file *, uint64_t);
-extern void __dump_init(uint64_t);
-extern void __dump_silence_system(void);
-extern void __dump_resume_system(void);
-extern int __dump_configure_header(dump_header_asm_t *, struct pt_regs *);
-#ifdef CONFIG_X86
-extern void __dump_save_panic_regs(dump_header_asm_t *);
-#endif
+static inline void get_current_regs(struct pt_regs *regs)
+{
+	__asm__ __volatile__("movl %%ebx,%0" : "=m"(regs->ebx));
+	__asm__ __volatile__("movl %%ecx,%0" : "=m"(regs->ecx));
+	__asm__ __volatile__("movl %%edx,%0" : "=m"(regs->edx));
+	__asm__ __volatile__("movl %%esi,%0" : "=m"(regs->esi));
+	__asm__ __volatile__("movl %%edi,%0" : "=m"(regs->edi));
+	__asm__ __volatile__("movl %%ebp,%0" : "=m"(regs->ebp));
+	__asm__ __volatile__("movl %%eax,%0" : "=m"(regs->eax));
+	__asm__ __volatile__("movl %%esp,%0" : "=m"(regs->esp));
+	__asm__ __volatile__("movw %%ss, %%ax;" :"=a"(regs->xss));
+	__asm__ __volatile__("movw %%cs, %%ax;" :"=a"(regs->xcs));
+	__asm__ __volatile__("movw %%ds, %%ax;" :"=a"(regs->xds));
+	__asm__ __volatile__("movw %%es, %%ax;" :"=a"(regs->xes));
+	__asm__ __volatile__("pushfl; popl %0" :"=m"(regs->eflags));
+	regs->eip = (unsigned long)current_text_addr();
+	
+}
+
+extern volatile int dump_in_progress;
+extern unsigned long irq_affinity[];
+extern dump_header_asm_t dump_header_asm;
+
+#ifdef CONFIG_SMP
+extern int (*dump_ipi_function_ptr)(struct pt_regs *);
+extern void dump_send_ipi(void);
+#else
+#define dump_send_ipi()
 #endif
+#endif /* __KERNEL__ */
 
 #endif /* _ASM_DUMP_H */
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/include/kl_task.h lkcd_cvs_new/lkcdutils/libklib/include/kl_task.h
--- lkcd_cvs_orig/lkcdutils/libklib/include/kl_task.h	Thu Oct 12 02:32:54 2000
+++ lkcd_cvs_new/lkcdutils/libklib/include/kl_task.h	Mon Nov 26 13:17:16 2001
@@ -9,7 +9,8 @@
 int kl_parent_pid(void *);
 kaddr_t kl_pid_to_task(kaddr_t);
 k_error_t kl_get_task_struct(kaddr_t, int, void *);
-kaddr_t kl_dumpeip(void);
-kaddr_t kl_dumpesp(void);
+kaddr_t kl_dumpeip(kaddr_t tsk);
+kaddr_t kl_dumpesp(kaddr_t tsk);
+int kl_smp_dumptask(kaddr_t tsk);
 kaddr_t kl_dumptask(void);
 kaddr_t kl_kernelstack(kaddr_t);
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/kl_memory.c lkcd_cvs_new/lkcdutils/libklib/kl_memory.c
--- lkcd_cvs_orig/lkcdutils/libklib/kl_memory.c	Fri Nov 23 17:25:35 2001
+++ lkcd_cvs_new/lkcdutils/libklib/kl_memory.c	Mon Nov 26 13:15:58 2001
@@ -123,6 +123,34 @@
 	return((meminfo_t *)NULL);
 }
 
+extern int get_dump_header_asm(dump_header_asm_t *dha);
+kaddr_t
+__kl_fix_vaddr(kaddr_t vaddr, size_t sz)
+{
+	dump_header_asm_t dha;
+	kaddr_t cur_task;
+	int i;
+
+	if (MIP->core_type != reg_core) {
+		return vaddr;
+	}
+	if (get_dump_header_asm(&dha))
+		return vaddr;
+
+	/* this is a very simplistic check to see if we have saved 
+	 * (snapshotted) this particular block. This is very limited 
+	 * to finding the saved task structs only.
+	 */
+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
+		if (dha.dha_smp_regs[i].eip < KL_PAGE_OFFSET)
+			continue; /* if task is in user space, no need to look at saved stack */
+		cur_task = dha.dha_smp_current_task[i];
+		if (vaddr >= cur_task && vaddr + sz <  cur_task + KSTACK_SIZE)
+			return (dha.dha_stack[i] + (vaddr - cur_task));
+	}
+	return vaddr;
+}
+
 /*
  * get_block()
  * 
@@ -142,13 +170,16 @@
 		KL_ERROR = KLE_ZERO_SIZE;
 	} else {
 		while (size > 0){
+			kaddr_t tmp = vaddr;
 			s=((vaddr & KL_PAGE_MASK) | (~KL_PAGE_MASK)) - 
 				vaddr + 1;
 			s= (size > s) ? s : size;
+			vaddr = __kl_fix_vaddr(vaddr, s);	
 			if ( kl_virtop(vaddr, mmap, &paddr) ) {
 				return(KL_ERROR);
 			}
 			kl_readmem(paddr, s, bp);
+			vaddr = tmp;
 			size=size - s;
 			vaddr=vaddr + s;
 			bp=bp + s;
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lkcd_config/lkcd_config.c lkcd_cvs_new/lkcdutils/lkcd_config/lkcd_config.c
--- lkcd_cvs_orig/lkcdutils/lkcd_config/lkcd_config.c	Fri Nov 23 17:25:37 2001
+++ lkcd_cvs_new/lkcdutils/lkcd_config/lkcd_config.c	Mon Nov 26 16:35:23 2001
@@ -242,7 +242,7 @@
 
 	/* set dump compression */
 	if (compress_set == DUMP_TRUE) {
-		if ((err = ioctl(dfd, DIOSDUMPCOMPRESS, (caddr_t)&compress)) < 0) {
+		if ((err = ioctl(dfd, DIOSDUMPCOMPRESS, (caddr_t)compress)) < 0) {
 			perror("ioctl() for dump compression failed");
 			close(dfd);
 			return (err);
@@ -251,7 +251,7 @@
 
 	/* set dump flags */
 	if (flags_set == DUMP_TRUE) {
-		if ((err = ioctl(dfd, DIOSDUMPFLAGS, (caddr_t)&flags)) < 0) {
+		if ((err = ioctl(dfd, DIOSDUMPFLAGS, (caddr_t)flags)) < 0) {
 			perror("ioctl() for dump flags failed");
 			close(dfd);
 			return (err);
@@ -260,7 +260,7 @@
 
 	/* set dump level */
 	if (level_set == DUMP_TRUE) {
-		if ((err = ioctl(dfd, DIOSDUMPLEVEL, (caddr_t)&level)) < 0) {
+		if ((err = ioctl(dfd, DIOSDUMPLEVEL, (caddr_t)level)) < 0) {
 			perror("ioctl() for dump level failed");
 			close(dfd);
 			return (err);


From lkcd-general-owner@lists.sourceforge.net Tue Nov 27 00:34:17 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 168dgz-0008IZ-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 27 Nov 2001 00:34:13 -0800
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAR9Y3o11297
	for <lkcd@oss.sgi.com>; Tue, 27 Nov 2001 01:34:04 -0800
Received: from westrelay03.boulder.ibm.com (westrelay03.boulder.ibm.com [9.99.140.24])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id DAA68354;
	Tue, 27 Nov 2001 03:31:18 -0500
Received: from vamsiks.in.ibm.com (vamsiks.in.ibm.com [9.186.133.18])
	by westrelay03.boulder.ibm.com (8.11.1m3/NCO v5.01) with ESMTP id fAR8XgY72130;
	Tue, 27 Nov 2001 01:33:43 -0700
Received: (from vamsi@localhost)
	by vamsiks.in.ibm.com (8.11.2/8.11.2) id fAR90J008368;
	Tue, 27 Nov 2001 14:30:19 +0530
From: "Vamsi Krishna S ." <vamsi@in.ibm.com>
To: lkcd@oss.sgi.com, lkcd-general@lists.sourceforge.net
Cc: bharata <bharata@in.ibm.com>, suparna <bsuparna@in.ibm.com>,
   subodh <subodh@in.ibm.com>
Message-ID: <20011127143019.A8322@in.ibm.com>
Reply-To: vamsi@in.ibm.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
Subject: [lkcd-general] [PATCH]capturing registers/stack on all processors
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov 27 00:35:03 2001
X-Original-Date: Tue, 27 Nov 2001 14:30:19 +0530

Hello, 

Here is a patch against lkcd cvs (as on 11/26/2001) for capturing 
registers on all processors at the time of dumping. 

This has been found to be crucial to debug problems where some of 
the cpus on an SMP are hung (executing a tight loop, interrupts 
disabled).

We send an NMI-class IPI to other cpus to capture the registers
and stack. This is the only guaranteed way to ensure that other 
cpus respond. If they don't respond to NMI, there is absolutely 
nothing we can do in software. 

We need to capture the stack, even though we would prefer not to. 
The reason being that the stack could change between the time the 
registers are captured and the time that page is written out in 
the dumping process. The chages in the stack could be so 
significant as to render backtracing impossible/totally inaccurate. 

Currently, all the changes we made are specific to i386, even
though many of the changes could have been arch-independent. 

Brief list of chages:

kernel:
- extensions to dump_header_asm_t to add fields to capture:
	- smp_num_cpus and dumping_cpu
	- registers of all processors
	- pointers to current tasks
	- pointers to the location where stacks are saved
- remove __dump_save_panic_regs
- collect registers in panic()
- remove all use of dha_esp, dha_eip, dha_regs and use 
  dha_smp_regs consistantly
- cleanup dump_configure_header handling, ie, do it only
  once in dump_execute
- send NMI to all processors and capture their registers,
  current task and kernel stack as part of
  __configure_dump_header
- [bonus] new magic sysrq key 'd' to show the registers
  and, backtrace if inside kernel, on all processors
- [side effect] as part of capturing registers on panic
  we now seem to be able to backtrace correctly in 
  panic dump cases.

lcrash:
- new commands
	- rd
	- defcpu
- rd to display registers captured at the time of taking the
  dump on the processor which is currently the defcpu
- defcpu to set the default cpu and set deftask to the current
  task on that cpu at the time of dump
- new kl_smp_dumptask to determine while backtracing if this
  task is a current task on any of the processors at the time
  of dump
- changes to kl_dumpesp/kl_dumpeip to get the esp/eip values
  from dha_smp_regs.
- changes to get_block() to look at the saved stack if this
  task is a current task on any of the processors and was 
  inside the kernel when the dump was taken
- [unrelated bug fix] fix lkcd_config.c to pass the values of
  dump level, dump flags and compression_type instead of their
  addresses to the ioctl call to set them.


-- 
LKCD Team India
Linux Technology Center,
IBM Software Lab, Bangalore.
Ph: +91 80 5044959
Internet: vamsi@in.ibm.com

--

diff -urN -X dontdiff lkcd_cvs_orig/2.4/arch/i386/kernel/i386_ksyms.c lkcd_cvs_new/2.4/arch/i386/kernel/i386_ksyms.c
--- lkcd_cvs_orig/2.4/arch/i386/kernel/i386_ksyms.c	Mon Sep 24 15:31:42 2001
+++ lkcd_cvs_new/2.4/arch/i386/kernel/i386_ksyms.c	Mon Nov 26 14:03:33 2001
@@ -31,8 +31,6 @@
 
 extern void dump_thread(struct pt_regs *, struct user *);
 extern spinlock_t rtc_lock;
-extern irq_desc_t irq_desc[];
-extern unsigned long irq_affinity[];
 
 #if defined(CONFIG_APM) || defined(CONFIG_APM_MODULE)
 extern void machine_real_restart(unsigned char *, int);
@@ -150,8 +148,6 @@
 #endif
 
 EXPORT_SYMBOL(get_wchan);
-EXPORT_SYMBOL(irq_affinity);
-EXPORT_SYMBOL(irq_desc);
 
 EXPORT_SYMBOL(rtc_lock);
 
@@ -164,4 +160,17 @@
 
 #ifdef CONFIG_X86_PAE
 EXPORT_SYMBOL(empty_zero_page);
+#endif
+
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+extern irq_desc_t irq_desc[];
+extern unsigned long irq_affinity[];
+EXPORT_SYMBOL(irq_affinity);
+EXPORT_SYMBOL(irq_desc);
+#ifdef CONFIG_SMP
+extern void dump_send_ipi(void);
+EXPORT_SYMBOL(dump_send_ipi);
+extern int (*dump_ipi_function_ptr)(struct pt_regs *);
+EXPORT_SYMBOL(dump_ipi_function_ptr);
+#endif
 #endif
diff -urN -X dontdiff lkcd_cvs_orig/2.4/arch/i386/kernel/smp.c lkcd_cvs_new/2.4/arch/i386/kernel/smp.c
--- lkcd_cvs_orig/2.4/arch/i386/kernel/smp.c	Tue Oct 16 12:51:44 2001
+++ lkcd_cvs_new/2.4/arch/i386/kernel/smp.c	Mon Nov 26 14:06:02 2001
@@ -142,6 +142,15 @@
 	 */
 	cfg = __prepare_ICR(shortcut, vector);
 
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+	if (vector == DUMP_VECTOR) {
+		/*
+		 * Setup DUMP IPI to be delivered as an NMI
+		 */
+		cfg = (cfg&~APIC_VECTOR_MASK)|APIC_DM_NMI;
+	}
+#endif	/* CONFIG_DUMP */
+
 	/*
 	 * Send the IPI. The write to APIC_ICR fires this off.
 	 */
@@ -424,6 +433,13 @@
 
 	do_flush_tlb_all_local();
 }
+
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+void dump_send_ipi(void)
+{
+	send_IPI_allbutself(DUMP_VECTOR);
+}
+#endif	
 
 /*
  * this function sends a 'reschedule' IPI to another CPU.
diff -urN -X dontdiff lkcd_cvs_orig/2.4/arch/i386/kernel/traps.c lkcd_cvs_new/2.4/arch/i386/kernel/traps.c
--- lkcd_cvs_orig/2.4/arch/i386/kernel/traps.c	Wed Sep 26 15:16:15 2001
+++ lkcd_cvs_new/2.4/arch/i386/kernel/traps.c	Mon Nov 26 16:46:48 2001
@@ -89,6 +89,105 @@
 
 int kstack_depth_to_print = 24;
 
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+/* 
+ * This code mimics show_trace() etc in arch/i386/kernel/traps.c. We don't 
+ * use them directly as they depend on 8K aligned kernel stacks that our
+ * saved stacks don't satisfy. However, there is move to relax the requirement
+ * on task_struct to be 8K-aligned. Once that happens, we could simpify this
+ * function.
+ */
+void show_this_cpu_state(int cpu, struct pt_regs * regs, struct task_struct *tsk)
+{
+	int i;
+	unsigned long *esp;
+	unsigned char *c;
+	int in_kernel = 1;
+
+	esp = (unsigned long *)regs->esp;
+	c = (unsigned char *)regs->eip;
+
+	if (regs->xcs & 3) {
+		in_kernel = 0;
+	}
+	printk("CPU:    %d\nEIP:    %04x:[<%08lx>]\nEFLAGS: %08lx\n",
+		cpu, 0xffff & regs->xcs, regs->eip, regs->eflags);
+	printk("eax: %08lx   ebx: %08lx   ecx: %08lx   edx: %08lx\n",
+		regs->eax, regs->ebx, regs->ecx, regs->edx);
+	printk("esi: %08lx   edi: %08lx   ebp: %08lx   esp: %p\n",
+		regs->esi, regs->edi, regs->ebp, esp);
+	printk("ds: %04x   es: %04x   ss: %04x\n",
+		regs->xds & 0xffff, regs->xes & 0xffff, regs->xss & 0xffff);
+	if (!tsk) {
+		printk("no stack for this cpu\n");
+		return;
+	}
+	printk("Process %s (pid: %d, stackpage=%08lx)",
+		tsk->comm, tsk->pid, 4096+(regs->esp & ~(THREAD_SIZE-1)));
+	/*
+	 * When in-kernel, we also print out the stack and code at the
+	 * time of the fault..
+	 */
+	if (in_kernel) {
+		unsigned long *stack;
+		unsigned long addr, module_start, module_end;
+		extern char _stext, _etext;
+
+		extern int kstack_depth_to_print;
+
+		esp = (unsigned long *)((unsigned long)tsk + (regs->esp & (THREAD_SIZE-1)));
+
+		printk("\nStack: ");
+		stack = esp;
+		for(i=0; i < kstack_depth_to_print; i++) {
+			if ((unsigned long)stack > (unsigned long)tsk + THREAD_SIZE-1)
+				break;
+			if (i && ((i % 8) == 0))
+				printk("\n       ");
+			printk("%08lx ", *stack++);
+		}
+		
+		printk("\nCall Trace: ");
+		i = 1;
+		stack = esp;
+		module_start = VMALLOC_START;
+		module_end = VMALLOC_END;
+		module_end = 0;
+		while ((unsigned long)stack < (unsigned long)tsk + THREAD_SIZE) {
+			addr = *stack++;
+			/*
+			 * If the address is either in the text segment of the
+			 * kernel, or in the region which contains vmalloc'ed
+			 * memory, it *may* be the address of a calling
+			 * routine; if so, print it so that someone tracing
+			 * down the cause of the crash will be able to figure
+			 * out the call path that was taken.
+			 */
+			if (((addr >= (unsigned long) &_stext) &&
+			     (addr <= (unsigned long) &_etext)) ||
+			    ((addr >= module_start) && (addr <= module_end))) {
+				if (i && ((i % 8) == 0))
+					printk("\n       ");
+				printk("[<%08lx>] ", addr);
+				i++;
+			}
+		}
+		printk("\n");
+
+		printk("\nCode: ");
+		if(regs->eip < PAGE_OFFSET) {
+			printk("eip in user space. error.\n");
+		}
+
+		for(i=0;i<20;i++) {
+			printk("%02x ", *c++);
+		}
+	}
+	printk("\n");
+	return;
+}	
+#endif /* CONFIG_DUMP */
+
 /*
  * These constants are for searching for possible module text
  * segments.
@@ -471,12 +570,33 @@
 }
 #endif
 
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+#ifdef CONFIG_SMP
+int (*dump_ipi_function_ptr)(struct pt_regs *) = NULL;
+static int dump_ipi(struct pt_regs *regs)
+{
+	if (!(dump_ipi_function_ptr && dump_ipi_function_ptr(regs))) {
+		return 0;
+	}
+	ack_APIC_irq();
+	return 1;
+}	
+#else
+#define dump_ipi(regs) 0
+#endif
+#endif
+
 asmlinkage void do_nmi(struct pt_regs * regs, long error_code)
 {
 	unsigned char reason = inb(0x61);
 
 
 	++nmi_count(smp_processor_id());
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+	if (dump_ipi(regs)) {
+		return;
+	}
+#endif
 	if (!(reason & 0xc0)) {
 #if CONFIG_X86_IO_APIC
 		/*
diff -urN -X dontdiff lkcd_cvs_orig/2.4/drivers/char/sysrq.c lkcd_cvs_new/2.4/drivers/char/sysrq.c
--- lkcd_cvs_orig/2.4/drivers/char/sysrq.c	Fri Nov 23 17:25:29 2001
+++ lkcd_cvs_new/2.4/drivers/char/sysrq.c	Tue Nov 27 14:01:47 2001
@@ -96,6 +96,15 @@
 		dump("sysrq", pt_regs);
 		break;
 #endif
+#if defined(CONFIG_DUMP)
+	case 'd':
+		{
+		extern void show_cpu_state(struct pt_regs *);
+		printk("Show state of all cpus\n");
+		show_cpu_state(pt_regs);
+		break;
+		}
+#endif
 
 	case 'o':					    /* O -- power off */
 		if (sysrq_power_off) {
diff -urN -X dontdiff lkcd_cvs_orig/2.4/drivers/dump/dump_base.c lkcd_cvs_new/2.4/drivers/dump/dump_base.c
--- lkcd_cvs_orig/2.4/drivers/dump/dump_base.c	Fri Nov 23 17:25:30 2001
+++ lkcd_cvs_new/2.4/drivers/dump/dump_base.c	Mon Nov 26 16:34:44 2001
@@ -268,14 +268,12 @@
 extern struct new_utsname system_utsname;     /* system information        */
 
 /* external architecture-specific functions */
-extern void __dump_open(struct file *, uint64_t);
+extern void __dump_open(void);
+extern void __dump_cleanup(void);
 extern void __dump_init(uint64_t);
-extern int __dump_configure_header(dump_header_asm_t *, struct pt_regs *);
+extern int __dump_configure_header(struct pt_regs *);
 extern unsigned int  __dump_silence_system(unsigned int);
 extern unsigned int  __dump_resume_system(unsigned int);
-#ifdef CONFIG_X86
-extern void __dump_save_panic_regs(dump_header_asm_t *);
-#endif
 
 /* external functions                                                      */
 extern void si_meminfo(struct sysinfo *);
@@ -736,7 +734,7 @@
 	}
 
 	/* configure architecture-specific dump header values */
-	if (!__dump_configure_header(&dump_header_asm, regs)) {
+	if (!__dump_configure_header(regs)) {
 		return (0);
 	}
 	return (1);
@@ -792,16 +790,11 @@
  *       memory pages and dumps the data to disk (using other functions).
  */
 static int
-dump_execute_memdump(char *panic_str, struct pt_regs *regs)
+dump_execute_memdump(void)
 {
 	int counter = 0, state = 0;
 	unsigned long mem_loc, buf_loc;
 
-	if (!dump_configure_header(panic_str, regs)) {
-		DUMP_PRINT("Dump header could not be configured!");
-		return (-1);
-	}
-
 	DUMP_PRINT("\nDump compression value is 0x%x ...", dump_compress);
 
 	DUMP_PRINT("\nWriting dump header ...");
@@ -939,39 +932,23 @@
 		return;
 	}
 
+	if(!dump_configure_header(panic_str, regs)) {
+		DUMP_PRINT("\ndump header could not be configured!");
+		return;
+	}
+
 	/* silence the system */
 	dump_silence_system();
 
 	/* bail out if we're not going to do any dumping */
 	if (dump_level != DUMP_LEVEL_NONE) {
 		/* inform users of what we are about to do */
-#ifdef CONFIG_SMP
 		DUMP_PRINTN("Dumping to device 0x%x [%s] on CPU %d ...",
 			dump_device, bdevname(dump_device),
 			smp_processor_id());
-#else
-		DUMP_PRINTN("Dumping to device 0x%x [%s] on CPU %d ...",
-			dump_device, bdevname(dump_device),
-			0);
-#endif
 
 		/* start walking through the page tables */
-		state = dump_execute_memdump(panic_str, regs);
-
-#ifdef CONFIG_X86
-		/*
-		 * Okay, this is REALLY annoying to have to
-		 * do.  What this means is that for x86
-		 * systems, we have to literally save the
-		 * esp/eip _now_, because we don't want the
-		 * esp/eip from dump_write_header() or
-		 * anything it calls to conflict with
-		 * re-building the panic() stack trace case.
-		 * So for that reason, we save the eip/esp
-		 * now so we can re-build the trace later.
-		 */
-		__dump_save_panic_regs(&dump_header_asm);
-#endif
+		state = dump_execute_memdump();
 
 		/* update header to disk for the last time */
 		if (dump_write_header() < 0) {
@@ -1054,7 +1031,7 @@
 	struct list_head *tmp;
 	dump_compress_t *dc;
 
-	/* try to remove the compression item */
+	/* try to set the compression type*/
 	list_for_each(tmp, &dump_compress_list) {
 		dc = list_entry(tmp, dump_compress_t, list);
 		if (dc->compress_type == compression_type) {
@@ -1210,6 +1187,7 @@
 			if (!(f->f_flags & O_RDWR)) {
 				return (-EPERM);
 			}
+			__dump_open();
 			return (dump_open_kdev((kdev_t)arg));
 
 		/* get dump_device */
@@ -1423,6 +1401,9 @@
 	if (dump_page_buf) {
 		kfree((const void *)dump_page_buf);
 	}
+
+	/* arch-specific cleanup routine */
+	__dump_cleanup();
 
 	/* remove the proc entries */
 	dump_proc_cleanup();
diff -urN -X dontdiff lkcd_cvs_orig/2.4/drivers/dump/dump_i386.c lkcd_cvs_new/2.4/drivers/dump/dump_i386.c
--- lkcd_cvs_orig/2.4/drivers/dump/dump_i386.c	Thu Oct  4 14:19:49 2001
+++ lkcd_cvs_new/2.4/drivers/dump/dump_i386.c	Tue Nov 27 14:15:26 2001
@@ -21,50 +21,143 @@
 #include <linux/kernel.h>
 #include <linux/smp.h>
 #include <linux/fs.h>
+#include <linux/vmalloc.h>
 #include <linux/dump.h>
 #include <linux/mm.h>
 #include <asm/processor.h>
 #include <asm/hardirq.h>
 #include <linux/irq.h>
 
-extern volatile int dump_in_progress;
-extern unsigned long irq_affinity[NR_IRQS];
 static unsigned long saved_affinity[NR_IRQS];
 
-/*
- * Name: __dump_save_panic_regs()
- * Func: Save the EIP (really the RA).  We may pass an argument later.
- * 	 Save ESP also here. 
- */
-inline void 
-__dump_save_panic_regs(dump_header_asm_t *dha)
-{
-	__asm__ __volatile__("movl  %%esp, %0\n"
-		: "=r" (dha->dha_esp));
-	/* hate to do this, but ... */
-#ifdef CONFIG_FRAME_POINTER
-	__asm__ __volatile__("movl  4(%%esp), %0\n"
-		: "=r" (dha->dha_eip));
+static int alloc_dha_stack(void)
+{
+	int i;
+	void *ptr;
+	
+	if (dump_header_asm.dha_stack[0])
+		return 0;
+
+       	ptr = vmalloc(THREAD_SIZE * smp_num_cpus);
+	if (!ptr) {
+		printk("vmalloc for dha_stacks failed\n");
+		return -ENOMEM;
+	}
+
+	for (i = 0; i < smp_num_cpus; i++) {
+		dump_header_asm.dha_stack[i] = (void *)((unsigned long)ptr + (i * THREAD_SIZE));
+	}
+	return 0;
+}
+
+static int free_dha_stack(void) 
+{
+	if (dump_header_asm.dha_stack[0])
+		vfree(dump_header_asm.dha_stack[0]);	
+	return 0;
+}
+
+/* In case of panic dumps, we collects regs on entry to panic.
+ * so, we shouldn't 'fix' ssesp here again. But it is hard to
+ * tell just looking at regs whether ssesp need fixing. We make
+ * this decision by looking at xss in regs. If we have better
+ * means to determine that ssesp are valid (by some flag which
+ * tells that we are here due to panic dump), then we can use
+ * that instead of this kludge.
+ */
+static inline void 
+fix_ssesp(struct pt_regs *regs, int cpu)
+{
+	if (!user_mode(regs)) {
+		if ((cpu == dump_header_asm.dha_dumping_cpu) &&
+			regs->xss == __KERNEL_DS) 
+			return;
+		dump_header_asm.dha_smp_regs[cpu].esp = 
+				(unsigned long)&(regs->esp);
+		__asm__ __volatile__ ("movw %%ss, %%ax;"
+			:"=a"(dump_header_asm.dha_smp_regs[cpu].xss));
+	}
+}
+
+static void
+save_this_cpu_state(int cpu, struct pt_regs *regs, struct task_struct *tsk)
+{
+	dump_header_asm.dha_smp_regs[cpu] = *regs;
+	dump_header_asm.dha_smp_current_task[cpu] = tsk;
+	fix_ssesp(regs, cpu);
+
+	if (dump_header_asm.dha_stack[cpu]) {
+		memcpy(dump_header_asm.dha_stack[cpu], tsk, THREAD_SIZE);
+	}
+	return;
+}
+
+#ifdef CONFIG_SMP
+static int dump_expect_ipi[NR_CPUS];
+static atomic_t waiting_for_dump_ipi;
+static int wait_for_dump_ipi = 1; /* always wait for ipi to to be handled */
+
+static int
+dump_ipi_handler(struct pt_regs *regs) 
+{
+	int cpu = smp_processor_id();
+
+	if (!dump_expect_ipi[cpu]) {
+		return 0;
+	}
+	
+	save_this_cpu_state(cpu, regs, current);
+
+	dump_expect_ipi[cpu] = 0;
+	atomic_dec(&waiting_for_dump_ipi);
+	return 1;
+}
+
+/* save registers on other processors */
+void 
+save_other_cpu_states(void)
+{
+	int i;
+
+	if (smp_num_cpus > 1) {
+		atomic_set(&waiting_for_dump_ipi, smp_num_cpus-1);
+		for (i = 0; i < NR_CPUS; i++)
+			dump_expect_ipi[i] = 1;
+		
+		dump_ipi_function_ptr = dump_ipi_handler;
+		dump_send_ipi();
+		/* may be we dont need to wait for NMI to be processed. 
+		   just write out the header at the end of dumping, if
+		   this IPI is not processed untill then, there probably
+		   is a problem and we just fail to capture state of 
+		   other cpus. */
+		if (wait_for_dump_ipi) {
+			while(atomic_read(&waiting_for_dump_ipi))
+				barrier();
+			dump_ipi_function_ptr = NULL;
+		}
+	}
+	return;
+}
 #else
-	__asm__ __volatile__("movl  (%%esp), %0\n"
-		: "=r" (dha->dha_eip));
+#define save_other_cpu_states()
 #endif
-}
 
 /*
  * Name: __dump_configure_header()
  * Func: Configure the dump header with all proper values.
  */
 int
-__dump_configure_header(dump_header_asm_t *dha, struct pt_regs *regs)
+__dump_configure_header(struct pt_regs *regs)
 {
-	/* save the dump specific esp/eip */
-	__dump_save_panic_regs(dha);
+	int cpu = smp_processor_id();
 
-	/* one final check -- modify if we're in user mode */
-	if ((regs) && (!user_mode(regs))) {
-		dha->dha_regs.esp = (unsigned long) &(regs->esp);
-	}
+	dump_header_asm.dha_smp_num_cpus = smp_num_cpus;
+	dump_header_asm.dha_dumping_cpu = cpu;
+
+	save_this_cpu_state(cpu, regs, current);
+
+	save_other_cpu_states();
 
 	return (1);
 }
@@ -87,13 +180,28 @@
  *       case it's necessary in the future.
  */
 void
-__dump_open(struct file *dump_file, uint64_t memory_size)
+__dump_open(void)
 {
+	alloc_dha_stack();
 	/* return */
 	return;
 }
 
 /*
+ * Name: __dump_cleanup()
+ * Func: Free any architecture specific data structures. This is called
+ *       when the dump module is being removed.
+ */
+void
+__dump_cleanup(void)
+{
+	free_dha_stack();
+	/* return */
+	return;
+}
+
+#ifdef CONFIG_SMP
+/*
  * Non dumping cpus will spin here. If a cpu is handling an irq when ipi is
  * received, we let go of it here while making sure that it hits schedule
  * on the way up and make it spin there instead.
@@ -108,6 +216,7 @@
 	}
 	return;
 }
+#endif
 
 /*
  * Routine to save the old irq affinities and change affinities of all irqs to
@@ -179,4 +288,24 @@
 
 	/* return */
 	return (0);
+}
+
+/* located in arch/i386/kernel/traps.c */
+extern void show_this_cpu_state(int cpu, struct pt_regs * regs, struct task_struct *tsk);
+
+void 
+show_cpu_state(struct pt_regs * regs)
+{
+	int cpu = smp_processor_id();
+	int i;
+
+	__dump_configure_header(regs);	
+
+	printk("__dump_configure_header done from cpu %d\n", cpu);
+
+	for (i = 0; i < smp_num_cpus; i++) {
+		show_this_cpu_state(i, dump_header_asm.dha_smp_regs[i], dump_header_asm.dha_stack[i]);
+	}
+	
+	return;
 }
diff -urN -X dontdiff lkcd_cvs_orig/2.4/include/asm-i386/dump.h lkcd_cvs_new/2.4/include/asm-i386/dump.h
--- lkcd_cvs_orig/2.4/include/asm-i386/dump.h	Wed Sep 26 15:21:38 2001
+++ lkcd_cvs_new/2.4/include/asm-i386/dump.h	Mon Nov 26 14:11:49 2001
@@ -14,6 +14,7 @@
 
 /* necessary header files */
 #include <asm/ptrace.h>                          /* for pt_regs             */
+#include <linux/threads.h>
 
 /* definitions */
 #define DUMP_ASM_MAGIC_NUMBER     0xdeaddeadULL  /* magic number            */
@@ -45,6 +46,44 @@
 	/* the dump registers */
 	struct pt_regs       dha_regs;
 
+	/* smp specific */
+	uint32_t	     dha_smp_num_cpus;
+	int		     dha_dumping_cpu;	
+	struct pt_regs	     dha_smp_regs[NR_CPUS];
+	void *		     dha_smp_current_task[NR_CPUS];
+	void *		     dha_stack[NR_CPUS];
 } dump_header_asm_t;
+
+#ifdef __KERNEL__
+static inline void get_current_regs(struct pt_regs *regs)
+{
+	__asm__ __volatile__("movl %%ebx,%0" : "=m"(regs->ebx));
+	__asm__ __volatile__("movl %%ecx,%0" : "=m"(regs->ecx));
+	__asm__ __volatile__("movl %%edx,%0" : "=m"(regs->edx));
+	__asm__ __volatile__("movl %%esi,%0" : "=m"(regs->esi));
+	__asm__ __volatile__("movl %%edi,%0" : "=m"(regs->edi));
+	__asm__ __volatile__("movl %%ebp,%0" : "=m"(regs->ebp));
+	__asm__ __volatile__("movl %%eax,%0" : "=m"(regs->eax));
+	__asm__ __volatile__("movl %%esp,%0" : "=m"(regs->esp));
+	__asm__ __volatile__("movw %%ss, %%ax;" :"=a"(regs->xss));
+	__asm__ __volatile__("movw %%cs, %%ax;" :"=a"(regs->xcs));
+	__asm__ __volatile__("movw %%ds, %%ax;" :"=a"(regs->xds));
+	__asm__ __volatile__("movw %%es, %%ax;" :"=a"(regs->xes));
+	__asm__ __volatile__("pushfl; popl %0" :"=m"(regs->eflags));
+	regs->eip = (unsigned long)current_text_addr();
+	
+}
+
+extern volatile int dump_in_progress;
+extern unsigned long irq_affinity[];
+extern dump_header_asm_t dump_header_asm;
+
+#ifdef CONFIG_SMP
+extern int (*dump_ipi_function_ptr)(struct pt_regs *);
+extern void dump_send_ipi(void);
+#else
+#define dump_send_ipi()
+#endif
+#endif /* __KERNEL__ */
 
 #endif /* _ASM_DUMP_H */
diff -urN -X dontdiff lkcd_cvs_orig/2.4/include/asm-i386/hw_irq.h lkcd_cvs_new/2.4/include/asm-i386/hw_irq.h
--- lkcd_cvs_orig/2.4/include/asm-i386/hw_irq.h	Thu Jan  1 05:30:00 1970
+++ lkcd_cvs_new/2.4/include/asm-i386/hw_irq.h	Mon Nov 26 14:13:23 2001
@@ -0,0 +1,226 @@
+#ifndef _ASM_HW_IRQ_H
+#define _ASM_HW_IRQ_H
+
+/*
+ *	linux/include/asm/hw_irq.h
+ *
+ *	(C) 1992, 1993 Linus Torvalds, (C) 1997 Ingo Molnar
+ *
+ *	moved some of the old arch/i386/kernel/irq.h to here. VY
+ *
+ *	IRQ/IPI changes taken from work by Thomas Radke
+ *	<tomsoft@informatik.tu-chemnitz.de>
+ */
+
+#include <linux/config.h>
+#include <asm/atomic.h>
+#include <asm/irq.h>
+
+/*
+ * IDT vectors usable for external interrupt sources start
+ * at 0x20:
+ */
+#define FIRST_EXTERNAL_VECTOR	0x20
+
+#define SYSCALL_VECTOR		0x80
+
+/*
+ * Vectors 0x20-0x2f are used for ISA interrupts.
+ */
+
+/*
+ * Special IRQ vectors used by the SMP architecture, 0xf0-0xff
+ *
+ *  some of the following vectors are 'rare', they are merged
+ *  into a single vector (CALL_FUNCTION_VECTOR) to save vector space.
+ *  TLB, reschedule and local APIC vectors are performance-critical.
+ *
+ *  Vectors 0xf0-0xfa are free (reserved for future Linux use).
+ */
+#define SPURIOUS_APIC_VECTOR	0xff
+#define ERROR_APIC_VECTOR	0xfe
+#define INVALIDATE_TLB_VECTOR	0xfd
+#define RESCHEDULE_VECTOR	0xfc
+#define CALL_FUNCTION_VECTOR	0xfb
+#define DUMP_VECTOR		0xfa
+
+/*
+ * Local APIC timer IRQ vector is on a different priority level,
+ * to work around the 'lost local interrupt if more than 2 IRQ
+ * sources per level' errata.
+ */
+#define LOCAL_TIMER_VECTOR	0xef
+
+/*
+ * First APIC vector available to drivers: (vectors 0x30-0xee)
+ * we start at 0x31 to spread out vectors evenly between priority
+ * levels. (0x80 is the syscall vector)
+ */
+#define FIRST_DEVICE_VECTOR	0x31
+#define FIRST_SYSTEM_VECTOR	0xef
+
+extern int irq_vector[NR_IRQS];
+#define IO_APIC_VECTOR(irq)	irq_vector[irq]
+
+/*
+ * Various low-level irq details needed by irq.c, process.c,
+ * time.c, io_apic.c and smp.c
+ *
+ * Interrupt entry/exit code at both C and assembly level
+ */
+
+extern void mask_irq(unsigned int irq);
+extern void unmask_irq(unsigned int irq);
+extern void disable_8259A_irq(unsigned int irq);
+extern void enable_8259A_irq(unsigned int irq);
+extern int i8259A_irq_pending(unsigned int irq);
+extern void make_8259A_irq(unsigned int irq);
+extern void init_8259A(int aeoi);
+extern void FASTCALL(send_IPI_self(int vector));
+extern void init_VISWS_APIC_irqs(void);
+extern void setup_IO_APIC(void);
+extern void disable_IO_APIC(void);
+extern void print_IO_APIC(void);
+extern int IO_APIC_get_PCI_irq_vector(int bus, int slot, int fn);
+extern void send_IPI(int dest, int vector);
+
+extern unsigned long io_apic_irqs;
+
+extern atomic_t irq_err_count;
+extern atomic_t irq_mis_count;
+
+extern char _stext, _etext;
+
+#define IO_APIC_IRQ(x) (((x) >= 16) || ((1<<(x)) & io_apic_irqs))
+
+#define __STR(x) #x
+#define STR(x) __STR(x)
+
+#define SAVE_ALL \
+	"cld\n\t" \
+	"pushl %es\n\t" \
+	"pushl %ds\n\t" \
+	"pushl %eax\n\t" \
+	"pushl %ebp\n\t" \
+	"pushl %edi\n\t" \
+	"pushl %esi\n\t" \
+	"pushl %edx\n\t" \
+	"pushl %ecx\n\t" \
+	"pushl %ebx\n\t" \
+	"movl $" STR(__KERNEL_DS) ",%edx\n\t" \
+	"movl %edx,%ds\n\t" \
+	"movl %edx,%es\n\t"
+
+#define IRQ_NAME2(nr) nr##_interrupt(void)
+#define IRQ_NAME(nr) IRQ_NAME2(IRQ##nr)
+
+#define GET_CURRENT \
+	"movl %esp, %ebx\n\t" \
+	"andl $-8192, %ebx\n\t"
+
+/*
+ *	SMP has a few special interrupts for IPI messages
+ */
+
+	/* there is a second layer of macro just to get the symbolic
+	   name for the vector evaluated. This change is for RTLinux */
+#define BUILD_SMP_INTERRUPT(x,v) XBUILD_SMP_INTERRUPT(x,v)
+#define XBUILD_SMP_INTERRUPT(x,v)\
+asmlinkage void x(void); \
+asmlinkage void call_##x(void); \
+__asm__( \
+"\n"__ALIGN_STR"\n" \
+SYMBOL_NAME_STR(x) ":\n\t" \
+	"pushl $"#v"\n\t" \
+	SAVE_ALL \
+	SYMBOL_NAME_STR(call_##x)":\n\t" \
+	"call "SYMBOL_NAME_STR(smp_##x)"\n\t" \
+	"jmp ret_from_intr\n");
+
+#define BUILD_SMP_TIMER_INTERRUPT(x,v) XBUILD_SMP_TIMER_INTERRUPT(x,v)
+#define XBUILD_SMP_TIMER_INTERRUPT(x,v) \
+asmlinkage void x(struct pt_regs * regs); \
+asmlinkage void call_##x(void); \
+__asm__( \
+"\n"__ALIGN_STR"\n" \
+SYMBOL_NAME_STR(x) ":\n\t" \
+	"pushl $"#v"\n\t" \
+	SAVE_ALL \
+	"movl %esp,%eax\n\t" \
+	"pushl %eax\n\t" \
+	SYMBOL_NAME_STR(call_##x)":\n\t" \
+	"call "SYMBOL_NAME_STR(smp_##x)"\n\t" \
+	"addl $4,%esp\n\t" \
+	"jmp ret_from_intr\n");
+
+#define BUILD_COMMON_IRQ() \
+asmlinkage void call_do_IRQ(void); \
+__asm__( \
+	"\n" __ALIGN_STR"\n" \
+	"common_interrupt:\n\t" \
+	SAVE_ALL \
+	"pushl $ret_from_intr\n\t" \
+	SYMBOL_NAME_STR(call_do_IRQ)":\n\t" \
+	"jmp "SYMBOL_NAME_STR(do_IRQ));
+
+/* 
+ * subtle. orig_eax is used by the signal code to distinct between
+ * system calls and interrupted 'random user-space'. Thus we have
+ * to put a negative value into orig_eax here. (the problem is that
+ * both system calls and IRQs want to have small integer numbers in
+ * orig_eax, and the syscall code has won the optimization conflict ;)
+ *
+ * Subtle as a pigs ear.  VY
+ */
+
+#define BUILD_IRQ(nr) \
+asmlinkage void IRQ_NAME(nr); \
+__asm__( \
+"\n"__ALIGN_STR"\n" \
+SYMBOL_NAME_STR(IRQ) #nr "_interrupt:\n\t" \
+	"pushl $"#nr"-256\n\t" \
+	"jmp common_interrupt");
+
+extern unsigned long prof_cpu_mask;
+extern unsigned int * prof_buffer;
+extern unsigned long prof_len;
+extern unsigned long prof_shift;
+
+/*
+ * x86 profiling function, SMP safe. We might want to do this in
+ * assembly totally?
+ */
+static inline void x86_do_profile (unsigned long eip)
+{
+	if (!prof_buffer)
+		return;
+
+	/*
+	 * Only measure the CPUs specified by /proc/irq/prof_cpu_mask.
+	 * (default is all CPUs.)
+	 */
+	if (!((1<<smp_processor_id()) & prof_cpu_mask))
+		return;
+
+	eip -= (unsigned long) &_stext;
+	eip >>= prof_shift;
+	/*
+	 * Don't ignore out-of-bounds EIP values silently,
+	 * put them into the last histogram slot, so if
+	 * present, they will show up as a sharp peak.
+	 */
+	if (eip > prof_len-1)
+		eip = prof_len-1;
+	atomic_inc((atomic_t *)&prof_buffer[eip]);
+}
+
+#ifdef CONFIG_SMP /*more of this file should probably be ifdefed SMP */
+static inline void hw_resend_irq(struct hw_interrupt_type *h, unsigned int i) {
+	if (IO_APIC_IRQ(i))
+		send_IPI_self(IO_APIC_VECTOR(i));
+}
+#else
+static inline void hw_resend_irq(struct hw_interrupt_type *h, unsigned int i) {}
+#endif
+
+#endif /* _ASM_HW_IRQ_H */
diff -urN -X dontdiff lkcd_cvs_orig/2.4/kernel/panic.c lkcd_cvs_new/2.4/kernel/panic.c
--- lkcd_cvs_orig/2.4/kernel/panic.c	Tue Oct 16 12:51:46 2001
+++ lkcd_cvs_new/2.4/kernel/panic.c	Mon Nov 26 17:33:33 2001
@@ -56,6 +56,10 @@
 #if defined(CONFIG_ARCH_S390)
         unsigned long caller = (unsigned long) __builtin_return_address(0);
 #endif
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+	struct pt_regs regs;
+	get_current_regs(&regs);
+#endif
 
 	va_start(args, fmt);
 	vsprintf(buf, fmt, args);
@@ -78,7 +82,9 @@
 
 	notifier_call_chain(&panic_notifier_list, 0, NULL);
 
-	dump(buf, NULL);
+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
+	dump(buf, &regs);
+#endif
 
 	if (panic_timeout > 0)
 	{
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/Makefile lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/Makefile
--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/Makefile	Fri Jan 26 02:42:01 2001
+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/Makefile	Tue Nov 27 13:08:26 2001
@@ -8,7 +8,7 @@
 include $(DEPTH)/commondefs
 
 TARGETS   = $(DEPTH)/libarch.a
-CFILES    = i386_cmds.c cmd_mktrace.c 
+CFILES    = i386_cmds.c cmd_mktrace.c cmd_rd.c cmd_defcpu.c
 OFILES    = $(CFILES:.c=.o)
 
 all: default
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c
--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c	Thu Jan  1 05:30:00 1970
+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c	Tue Nov 27 13:30:01 2001
@@ -0,0 +1,88 @@
+#include <lcrash.h>
+
+extern int get_dump_header_asm(dump_header_asm_t *);
+
+int defcpu = -1;
+
+/*
+ * deftask_cmd() -- Run the 'deftask' command.
+ */
+int
+defcpu_cmd(command_t *cmd)
+{
+	dump_header_asm_t dha;
+	int cpu;
+
+	if (cmd->nargs == 0) {
+		if (defcpu == -1) {
+			fprintf(cmd->efp, "No default cpu set\n");
+		} else {
+			fprintf(cmd->ofp, "Default cpu is %d\n", defcpu);
+		}
+		return(0);
+	} 
+
+	if (MIP->core_type != reg_core) {
+		fprintf(cmd->efp, "Can't use this command on live system\n");
+		return (1);
+	}
+	if (get_dump_header_asm(&dha))
+		return (1);
+
+	cpu = strtol(cmd->args[0], NULL, 10);
+
+	if (cpu >= dha.dha_smp_num_cpus) {
+		fprintf(cmd->efp, "Error setting defcpu to %s\n", cmd->args[0]);
+		return (1);
+	}
+	defcpu = cpu;
+	fprintf(cmd->ofp, "Default cpu is %d\n", defcpu);
+
+	if (dha.dha_stack[defcpu]) {
+		deftask = (kaddr_t)dha.dha_smp_current_task[defcpu];
+		fprintf(cmd->ofp, "Default task is 0x%x\n", deftask);
+	}
+	return (0);
+}
+
+#define _DEFCPU_USAGE	"[-w outfile] [cpu]"
+
+/*
+ * defcpu_usage() -- Print the usage string for the 'defcpu' command.
+ */
+void
+defcpu_usage(command_t *cmd)
+{
+	CMD_USAGE(cmd, _DEFCPU_USAGE);
+}
+
+/*
+ * defcpu_help() -- Print the help information for the 'defcpu' command.
+ */
+void
+defcpu_help(command_t *cmd)
+{
+	CMD_HELP(cmd, _DEFCPU_USAGE,
+	"Set the default cpu if one is indicated. Otherwise print the "
+	"value of default cpu." 
+        "When 'lcrash' is run on a live system, defcpu has no "
+        "meaning.\n\n"
+	"This command also sets the default task to the task running "
+	"on the default cpu at the time the dump is taken. "
+	"The rd command will display the registers on the default cpu "
+	"at the time the dump is taken. "
+        "The trace command will display a trace wrt the task "
+        "running on the default cpu at the time the dump is taken. ");
+}
+
+/*
+ * defcpu_parse() -- Parse the command line arguments for 'defcpu'.
+ */
+int
+defcpu_parse(command_t *cmd)
+{
+	if (set_cmd_flags(cmd, (C_WRITE), 0)) {
+		return(1);
+	}
+	return(0);
+}
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c
--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c	Thu Jan  1 05:30:00 1970
+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c	Mon Nov 26 16:43:31 2001
@@ -0,0 +1,65 @@
+#include <lcrash.h>
+
+extern int get_dump_header_asm(dump_header_asm_t *dump_header_asm);
+extern int defcpu;
+
+#define _RD_USAGE "[-w outfile]"
+
+void
+rd_usage(command_t *cmd)
+{
+	CMD_USAGE(cmd, _RD_USAGE);
+}
+
+void
+rd_help(command_t *cmd)
+{
+	CMD_HELP(cmd, _RD_USAGE,
+			"Display the register contents of the default cpu."
+			"This command can't be used on a live system ");
+}
+
+int
+rd_parse(command_t *cmd)
+{
+	if (set_cmd_flags(cmd, (C_WRITE), 0)) {
+		return(1);
+	}
+	return 0;
+}
+
+int
+rd_cmd(command_t *cmd)
+{
+	dump_header_asm_t dha;
+	struct pt_regs * regs;
+
+	if (cmd->nargs != 0) {
+		rd_usage(cmd);
+		return(1);
+	}	
+
+	if (MIP->core_type != reg_core) {
+		fprintf(cmd->efp, "Can't use this command on live system\n");
+		return(1);
+	}
+	
+	if (get_dump_header_asm(&dha))
+		return(1);
+
+	if (defcpu == -1)
+		defcpu = dha.dha_dumping_cpu;
+	
+	regs = &dha.dha_smp_regs[defcpu];
+
+	fprintf(cmd->ofp, "CPU:    %d   EIP:    %04x:[<%08lx>]\nEFLAGS: %08lx\n",
+		defcpu, regs->xcs & 0xffff, regs->eip, regs->eflags);
+	fprintf(cmd->ofp, "eax: %08lx   ebx: %08lx   ecx: %08lx   edx: %08lx\n",
+		regs->eax, regs->ebx, regs->ecx, regs->edx);
+	fprintf(cmd->ofp, "esi: %08lx   edi: %08lx   ebp: %08lx   esp: %08lx\n",
+		regs->esi, regs->edi, regs->ebp, regs->esp);
+	fprintf(cmd->ofp, "ds: %04x   es: %04x   ss: %04x\n",
+		regs->xds & 0xffff, regs->xes & 0xffff, regs->xss & 0xffff);
+
+	return(0);
+}
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c
--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c	Fri Nov 17 05:06:51 2000
+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c	Tue Nov 27 13:07:45 2001
@@ -6,8 +6,16 @@
 extern int mktrace_cmd(command_t *), mktrace_parse(command_t *);
 extern void mktrace_help(command_t *), mktrace_usage(command_t *);
 
+extern int rd_cmd(command_t *), rd_parse(command_t *);
+extern void rd_help(command_t *), rd_usage(command_t *);
+
+extern int defcpu_cmd(command_t *), defcpu_parse(command_t *);
+extern void defcpu_help(command_t *), defcpu_usage(command_t *);
+
 _command_t i386_cmdset[] = {
 	{"mktrace", 0, mktrace_cmd, mktrace_parse, mktrace_help, mktrace_usage},
 	{"mt", "mktrace" },
+	{"rd", 0, rd_cmd, rd_parse, rd_help, rd_usage},
+	{"defcpu", 0, defcpu_cmd, defcpu_parse, defcpu_help, defcpu_usage},
 	{(char *)0 }
 };
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/lib/trace.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/lib/trace.c
--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/lib/trace.c	Tue Jul  3 19:37:36 2001
+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/lib/trace.c	Mon Nov 26 13:22:32 2001
@@ -741,9 +741,9 @@
 		return(1);
 	} else {
 		saddr = kl_kernelstack(task);
-		if (task == kl_dumptask()) {
-			eip = kl_dumpeip();
-			esp = kl_dumpesp();
+		if (kl_smp_dumptask(task)) {
+			eip = kl_dumpeip(task);
+			esp = kl_dumpesp(task);
 		} else {
 			if (LINUX_2_2_X(KL_LINUX_RELEASE)) {
 				eip = KL_UINT(K_PTR(tsp, "task_struct", "tss"), 
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/arch/i386/kl_dump.c lkcd_cvs_new/lkcdutils/libklib/arch/i386/kl_dump.c
--- lkcd_cvs_orig/lkcdutils/libklib/arch/i386/kl_dump.c	Thu Oct 12 02:32:54 2000
+++ lkcd_cvs_new/lkcdutils/libklib/arch/i386/kl_dump.c	Mon Nov 26 13:11:08 2001
@@ -9,7 +9,7 @@
 /*
  * get_dump_header()
  */
-static int
+int
 get_dump_header(dump_header_t *dump_header)
 {
 	/* first, make sure this isn't a live system
@@ -42,7 +42,7 @@
 /*
  * get_dump_header_asm()
  */
-static int
+int
 get_dump_header_asm(dump_header_asm_t *dump_header_asm)
 {
 	dump_header_t dump_header;
@@ -90,36 +90,40 @@
  * kl_dumpesp()
  */
 kaddr_t
-kl_dumpesp(void)
+kl_dumpesp(kaddr_t tsk)
 {
-	dump_header_asm_t dump_header_asm;
+	dump_header_asm_t dha;
+	int i;
 
-	if (get_dump_header_asm(&dump_header_asm)) {
+	if (get_dump_header_asm(&dha)) {
 		return((kaddr_t)NULL);
 	}
-	if (dump_header_asm.dha_regs.esp) {
-		return((kaddr_t)dump_header_asm.dha_regs.esp);
-	} else { 
-		return((kaddr_t)dump_header_asm.dha_esp);
+	
+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
+		if (tsk == dha.dha_smp_current_task[i])
+			return (dha.dha_smp_regs[i].esp);
 	}
+	return((kaddr_t)NULL);
 }
 
 /*
  * kl_dumpeip()
  */
 kaddr_t
-kl_dumpeip(void)
+kl_dumpeip(kaddr_t tsk)
 {
-	dump_header_asm_t dump_header_asm;
+	dump_header_asm_t dha;
+	int i;
 
-	if (get_dump_header_asm(&dump_header_asm)) {
+	if (get_dump_header_asm(&dha)) {
 		return((kaddr_t)NULL);
 	}
-	if (dump_header_asm.dha_regs.eip) {
-		return((kaddr_t)dump_header_asm.dha_regs.eip);
-	} else { 
-		return((kaddr_t)dump_header_asm.dha_eip);
+	
+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
+		if (tsk == dha.dha_smp_current_task[i])
+			return (dha.dha_smp_regs[i].eip);
 	}
+	return((kaddr_t)NULL);
 }
 
 /*
@@ -134,5 +138,23 @@
 		return((kaddr_t)NULL);
 	}
 	return((kaddr_t)dump_header.dh_current_task);
+	
+}
+
+int
+kl_smp_dumptask(kaddr_t tsk)
+{
+	dump_header_asm_t dha;
+	int i;
+
+	if (get_dump_header_asm(&dha)) {
+		return((kaddr_t)NULL);
+	}
+	
+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
+		if (dha.dha_smp_regs[i].eip > KL_PAGE_OFFSET && tsk == dha.dha_smp_current_task[i])
+			return (1);
+	}
+	return (0);	
 }
 
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/include/asm-i386/dump.h lkcd_cvs_new/lkcdutils/libklib/include/asm-i386/dump.h
--- lkcd_cvs_orig/lkcdutils/libklib/include/asm-i386/dump.h	Wed Sep  5 13:38:00 2001
+++ lkcd_cvs_new/lkcdutils/libklib/include/asm-i386/dump.h	Mon Nov 26 15:41:09 2001
@@ -4,7 +4,8 @@
  * Created by: Matt Robinson (yakker@sgi.com)
  *
  * Copyright 1999 Silicon Graphics, Inc. All rights reserved.
- * 
+ *
+ * This code is released under version 2 of the GNU GPL.
  */
 
 /* This header file holds the architecture specific crash dump header */
@@ -13,6 +14,7 @@
 
 /* necessary header files */
 #include <asm/ptrace.h>                          /* for pt_regs             */
+#include <linux/threads.h>
 
 /* definitions */
 #define DUMP_ASM_MAGIC_NUMBER     0xdeaddeadULL  /* magic number            */
@@ -44,17 +46,44 @@
 	/* the dump registers */
 	struct pt_regs       dha_regs;
 
+	/* smp specific */
+	uint32_t	     dha_smp_num_cpus;
+	int		     dha_dumping_cpu;	
+	struct pt_regs	     dha_smp_regs[NR_CPUS];
+	void *		     dha_smp_current_task[NR_CPUS];
+	void *		     dha_stack[NR_CPUS];
 } dump_header_asm_t;
 
 #ifdef __KERNEL__
-extern void __dump_open(struct file *, uint64_t);
-extern void __dump_init(uint64_t);
-extern void __dump_silence_system(void);
-extern void __dump_resume_system(void);
-extern int __dump_configure_header(dump_header_asm_t *, struct pt_regs *);
-#ifdef CONFIG_X86
-extern void __dump_save_panic_regs(dump_header_asm_t *);
-#endif
+static inline void get_current_regs(struct pt_regs *regs)
+{
+	__asm__ __volatile__("movl %%ebx,%0" : "=m"(regs->ebx));
+	__asm__ __volatile__("movl %%ecx,%0" : "=m"(regs->ecx));
+	__asm__ __volatile__("movl %%edx,%0" : "=m"(regs->edx));
+	__asm__ __volatile__("movl %%esi,%0" : "=m"(regs->esi));
+	__asm__ __volatile__("movl %%edi,%0" : "=m"(regs->edi));
+	__asm__ __volatile__("movl %%ebp,%0" : "=m"(regs->ebp));
+	__asm__ __volatile__("movl %%eax,%0" : "=m"(regs->eax));
+	__asm__ __volatile__("movl %%esp,%0" : "=m"(regs->esp));
+	__asm__ __volatile__("movw %%ss, %%ax;" :"=a"(regs->xss));
+	__asm__ __volatile__("movw %%cs, %%ax;" :"=a"(regs->xcs));
+	__asm__ __volatile__("movw %%ds, %%ax;" :"=a"(regs->xds));
+	__asm__ __volatile__("movw %%es, %%ax;" :"=a"(regs->xes));
+	__asm__ __volatile__("pushfl; popl %0" :"=m"(regs->eflags));
+	regs->eip = (unsigned long)current_text_addr();
+	
+}
+
+extern volatile int dump_in_progress;
+extern unsigned long irq_affinity[];
+extern dump_header_asm_t dump_header_asm;
+
+#ifdef CONFIG_SMP
+extern int (*dump_ipi_function_ptr)(struct pt_regs *);
+extern void dump_send_ipi(void);
+#else
+#define dump_send_ipi()
 #endif
+#endif /* __KERNEL__ */
 
 #endif /* _ASM_DUMP_H */
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/include/kl_task.h lkcd_cvs_new/lkcdutils/libklib/include/kl_task.h
--- lkcd_cvs_orig/lkcdutils/libklib/include/kl_task.h	Thu Oct 12 02:32:54 2000
+++ lkcd_cvs_new/lkcdutils/libklib/include/kl_task.h	Mon Nov 26 13:17:16 2001
@@ -9,7 +9,8 @@
 int kl_parent_pid(void *);
 kaddr_t kl_pid_to_task(kaddr_t);
 k_error_t kl_get_task_struct(kaddr_t, int, void *);
-kaddr_t kl_dumpeip(void);
-kaddr_t kl_dumpesp(void);
+kaddr_t kl_dumpeip(kaddr_t tsk);
+kaddr_t kl_dumpesp(kaddr_t tsk);
+int kl_smp_dumptask(kaddr_t tsk);
 kaddr_t kl_dumptask(void);
 kaddr_t kl_kernelstack(kaddr_t);
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/kl_memory.c lkcd_cvs_new/lkcdutils/libklib/kl_memory.c
--- lkcd_cvs_orig/lkcdutils/libklib/kl_memory.c	Fri Nov 23 17:25:35 2001
+++ lkcd_cvs_new/lkcdutils/libklib/kl_memory.c	Mon Nov 26 13:15:58 2001
@@ -123,6 +123,34 @@
 	return((meminfo_t *)NULL);
 }
 
+extern int get_dump_header_asm(dump_header_asm_t *dha);
+kaddr_t
+__kl_fix_vaddr(kaddr_t vaddr, size_t sz)
+{
+	dump_header_asm_t dha;
+	kaddr_t cur_task;
+	int i;
+
+	if (MIP->core_type != reg_core) {
+		return vaddr;
+	}
+	if (get_dump_header_asm(&dha))
+		return vaddr;
+
+	/* this is a very simplistic check to see if we have saved 
+	 * (snapshotted) this particular block. This is very limited 
+	 * to finding the saved task structs only.
+	 */
+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
+		if (dha.dha_smp_regs[i].eip < KL_PAGE_OFFSET)
+			continue; /* if task is in user space, no need to look at saved stack */
+		cur_task = dha.dha_smp_current_task[i];
+		if (vaddr >= cur_task && vaddr + sz <  cur_task + KSTACK_SIZE)
+			return (dha.dha_stack[i] + (vaddr - cur_task));
+	}
+	return vaddr;
+}
+
 /*
  * get_block()
  * 
@@ -142,13 +170,16 @@
 		KL_ERROR = KLE_ZERO_SIZE;
 	} else {
 		while (size > 0){
+			kaddr_t tmp = vaddr;
 			s=((vaddr & KL_PAGE_MASK) | (~KL_PAGE_MASK)) - 
 				vaddr + 1;
 			s= (size > s) ? s : size;
+			vaddr = __kl_fix_vaddr(vaddr, s);	
 			if ( kl_virtop(vaddr, mmap, &paddr) ) {
 				return(KL_ERROR);
 			}
 			kl_readmem(paddr, s, bp);
+			vaddr = tmp;
 			size=size - s;
 			vaddr=vaddr + s;
 			bp=bp + s;
diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lkcd_config/lkcd_config.c lkcd_cvs_new/lkcdutils/lkcd_config/lkcd_config.c
--- lkcd_cvs_orig/lkcdutils/lkcd_config/lkcd_config.c	Fri Nov 23 17:25:37 2001
+++ lkcd_cvs_new/lkcdutils/lkcd_config/lkcd_config.c	Mon Nov 26 16:35:23 2001
@@ -242,7 +242,7 @@
 
 	/* set dump compression */
 	if (compress_set == DUMP_TRUE) {
-		if ((err = ioctl(dfd, DIOSDUMPCOMPRESS, (caddr_t)&compress)) < 0) {
+		if ((err = ioctl(dfd, DIOSDUMPCOMPRESS, (caddr_t)compress)) < 0) {
 			perror("ioctl() for dump compression failed");
 			close(dfd);
 			return (err);
@@ -251,7 +251,7 @@
 
 	/* set dump flags */
 	if (flags_set == DUMP_TRUE) {
-		if ((err = ioctl(dfd, DIOSDUMPFLAGS, (caddr_t)&flags)) < 0) {
+		if ((err = ioctl(dfd, DIOSDUMPFLAGS, (caddr_t)flags)) < 0) {
 			perror("ioctl() for dump flags failed");
 			close(dfd);
 			return (err);
@@ -260,7 +260,7 @@
 
 	/* set dump level */
 	if (level_set == DUMP_TRUE) {
-		if ((err = ioctl(dfd, DIOSDUMPLEVEL, (caddr_t)&level)) < 0) {
+		if ((err = ioctl(dfd, DIOSDUMPLEVEL, (caddr_t)level)) < 0) {
 			perror("ioctl() for dump level failed");
 			close(dfd);
 			return (err);


From lkcd-general-owner@lists.sourceforge.net Tue Nov 27 16:07:07 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 168sFn-0006SJ-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 27 Nov 2001 16:07:07 -0800
Received: from pneumatic-tube.sgi.com (pneumatic-tube.sgi.com [204.94.214.22])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAS177o04794
	for <lkcd@oss.sgi.com>; Tue, 27 Nov 2001 17:07:07 -0800
Received: from loco.csd.sgi.com (loco.csd.sgi.com [130.62.73.130]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id QAA07504
	for <lkcd@oss.sgi.com>; Tue, 27 Nov 2001 16:07:00 -0800 (PST)
	mail_from (tjm@sgi.com)
Received: from striker (striker.csd.sgi.com [130.62.73.131]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id QAA30346; Tue, 27 Nov 2001 16:00:22 -0800 (PST)
Message-ID: <004f01c177a1$aa471630$83493e82@corp.sgi.com>
From: "Tom Morano" <tjm@sgi.com>
To: "Matt D. Robinson" <yakker@aparity.com>,
   "Andreas Herrmann" <AHERRMAN@de.ibm.com>
Cc: <lkcd@oss.sgi.com>
References: <Pine.LNX.4.30.0111221514040.11360-100000@nakedeye.aparity.com>
Subject: Re: [lkcd-general] dump and highmem
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov 27 16:08:01 2001
X-Original-Date: Tue, 27 Nov 2001 16:14:36 -0800

Hi Matt,

I've been playing around with this some (partly to test my recent changes
and partly to
check into the problem that Valerie is having). I have a question about why
we are shifting
the address PAGE_SHIFT bits when we load it into the page header in the
dump? For
the i386 live dump that I generated, this results is a very wrong address
and weird hash
values. When this is combined with the recent changes you made for the
high_memory
issue, all I get is lcrash failing miserably during start-up (similar to
what Adreas saw).

When I changed the live_dump() code to store a physical address, then
everything
worked OK. What is the benefit of the shift? I'm concerned about this also
from an snia
point of view. We will have very complex system hardware/memory
configurations (large
amounts of memory on multiple nodes with large holes in the memory). I
should be able
to take any virtual address (mapped or non-mapped), convert it to a physical
address,
and then find it in the dump.

What am I missing here?

Thanks,

Tom


----- Original Message -----
From: "Matt D. Robinson" <yakker@aparity.com>
To: "Andreas Herrmann" <AHERRMAN@de.ibm.com>
Cc: <lkcd@oss.sgi.com>
Sent: Thursday, November 22, 2001 3:16 PM
Subject: Re: [lkcd-general] dump and highmem


> Hi, Andreas.  I think it would be better if we fixed the
> livedump function to correct the dp_address mechanism to
> move it more in-line with the kernel code.  We could fix
> this in 'lcrash' itself, and differentiate the read, but
> we can fix it in the vmdump.c file and make sure it works
> for all cases.
>
> Let me fix this when I get back (Saturday).  Should be as
> simple as referencing mem_loc << lkcdinfo.page_shift into
> the dp_address.
>
> --Matt
>
> On Thu, 22 Nov 2001, Andreas Herrmann wrote:
> |>Hi,
> |>
> |>I tried out the current cvs version of lcrash. And oops, lcrash fails
while
> |>reading its own dumps, generated with
> |>lcrash's livedump command. I observed this on i386.
> |>
> |>Details: When initializing KL_HIGH_MEMORY, lcrash fails. Setting
> |>cmp_debug=1, I received following output:
> |>
> |>__cmppread(): initiating search for 0x2424a8
> |>__cmppindex(): hash =   8228, addr = 0x2424a8
> |>__cmppindex(): addr = 0x2424a8, tmpptr->addr = 0xc0236000
> |>__cmppread(): page not found! (0x2424a8)
> |>
> |>Using cvs version 1.8 of file libklib/kl_cmp.c lcrash works fine. The
> |>corresponding output is:
> |>
> |>__cmppread(): initiating search for 0x2424a8
> |>__cmppindex(): hash =   8228, addr = 0x2424a8
> |>__cmppindex(): addr = 0x2424a8, tmpptr->addr = 0x242000
> |>__cmppread(): found the page in the page index!
> |>0x242000: 1247 -> 4096 COMPRESSED, writing 4096 bytes
> |>__cmppinsert(): Malloc occurred! [0]
> |>__cmppinsert(): Inserting page into cache! (0x2424a8) [0]...
> |>
> |>Probably the hash values were mixed up ...
> |>
> |>As I found out, the error is caused by Kapish's  patch, which was
checked
> |>in on 11/14/2001.(See attached mails.)
> |>lcrash works fine with respect to livedumps when using version 1.8 of
file
> |>libklib/kl_cmp.c
> |>which contains
> |>
> |>paddr = (kaddr_t)dp->dp_address - KL_PAGE_OFFSET;
> |>
> |>instead of
> |>
> |>paddr = (kaddr_t)dp->dp_address;
> |>
> |>in function __cmpconvertaddr().
> |>
> |>
> |>Maybe someone has time to rework Kapish's patch to be "compatible" with
> |>livedumps, too?
> |>
> |>Regards,
> |>
> |>Andreas
> |>
> |>--
> |>Linux for eServer Development
> |>Tel :  +49-7031-16-4640
> |>Notes mail :  Andreas Herrmann/GERMANY/IBM@IBMDE
> |>email :  aherrman@de.ibm.com
> |>
> |>----- Forwarded by Andreas Herrmann/Germany/IBM on 11/22/01 06:57
PM -----
> |>|--------+---------------------------------------->
> |>|        |          Kapish K <kapish@ureach.com>  |
> |>|        |          Sent by:                      |
> |>|        |          lkcd-general-admin@lists.sourc|
> |>|        |          eforge.net                    |
> |>|        |                                        |
> |>|        |                                        |
> |>|        |          11/13/01 06:28 AM             |
> |>|        |          Please respond to kapish      |
> |>|        |                                        |
> |>|--------+---------------------------------------->
> |>
>---------------------------------------------------------------------------
--------------------------|
> |>  |
|
> |>  |      To:     lkcd@oss.sgi.com
|
> |>  |      cc:
|
> |>  |      Subject:     [lkcd-general] dump and highmem
|
> |>  |
|
> |>  |
|
> |>
>---------------------------------------------------------------------------
--------------------------|
> |>
> |>
> |>
> |>Hello,
> |>           While trying to use lkcd and lcrash ( 4.0 ) on dumps from
> |>highmem enbaled boxes, a colleague noticed what might be a bug
> |>in the lkcd code.
> |>The error seems to occur when lcrash looks at the headrrs of
> |>loaded modules in the dump file, one of which is mapped at the
> |>highmemory region of physical memory.
> |>The dp_address field ( in add_dump_page ) is the virtual address
> |>( obtained frpm page_address(p) which gets the page->vitual
> |>address ), but for pages in highmemory, this would be zero
> |>unless the page was kampped at that point in time during dump.
> |>Is that right?
> |>When lcrash starts, it seems to build an index of physical
> |>pages, and
> |>it uses the dp_address fields to determine the real memory
> |>address by subtracting the kernel page offset (usually
> |>0xC0000000).  Thus, the real memory address of the high memory
> |>pages seem to be incorrect.
> |>We fixed this by changing the code so that dp_address is a
> |>real memory address rather than virtual.
> |>so, the changes we did were the following:
> |>in dump_base.c:
> |>--- drivers/dump/dump_base.c.orig         Wed Nov  7 02:54:53 2001
> |>+++ drivers/dump/dump_base.c        Fri Nov  9 02:24:05 2001
> |>@@ -486,13 +486,12 @@
> |> #if defined(CONFIG_X86) || defined(CONFIG_ALPHA)
> |>           extern int page_is_ram(unsigned long);
> |> #endif
> |>-          unsigned long addr, size;
> |>+          unsigned long size;
> |>           dump_page_t dp;
> |>           struct page *p = (struct page *)&(mem_map[mem_loc]);
> |>           void *vaddr;
> |>
> |>-          addr = (unsigned long)page_address(p);
> |>-          dp.dp_address = (uint64_t)addr;
> |>+          dp.dp_address = (uint64_t)mem_loc << PAGE_SHIFT;
> |>           dp.dp_flags = DUMP_DH_RAW;
> |>
> |>           /*
> |>in lkcdutils-1.0-7.src.rpm:
> |>
> |>--- libklib/kl_cmp.c.orig           Sat Jun 16 22:50:10 2001
> |>+++ libklib/kl_cmp.c           Thu Nov  8 18:11:43 2001
> |>@@ -920,7 +920,7 @@
> |> {
> |>           kaddr_t paddr;
> |>
> |>-          paddr = (kaddr_t)dp->dp_address - KL_PAGE_OFFSET;
> |>+          paddr = (kaddr_t)dp->dp_address;
> |>           return (paddr);
> |> }
> |>
> |>This seems to fix our problems with being able to look at trace
> |>records for pages in high memory.
> |>What I am looking for is whether this has already been
> |>identified by the lckd team as a problem, and if so, has the fix
> |>you plan the same? if this is not a problem, what have we missd
> |>in here? And finally, if this is a problem and the the solution
> |>is acceptable, could this change get into lkcd?
> |>TIA
> |>
> |>________________________________________________
> |>Get your own "800" number
> |>Voicemail, fax, email, and a lot more
> |>http://www.ureach.com/reg/tag
> |>
> |>_______________________________________________
> |>Lkcd-general mailing list
> |>Lkcd-general@lists.sourceforge.net
> |>https://lists.sourceforge.net/lists/listinfo/lkcd-general
> |>
> |>----- Forwarded by Andreas Herrmann/Germany/IBM on 11/22/01 06:57
PM -----
> |>|--------+---------------------------------------->
> |>|        |          "Matt D. Robinson"            |
> |>|        |          <yakker@alacritech.com>       |
> |>|        |          Sent by:                      |
> |>|        |          lkcd-general-admin@lists.sourc|
> |>|        |          eforge.net                    |
> |>|        |                                        |
> |>|        |                                        |
> |>|        |          11/15/01 10:53 AM             |
> |>|        |          Please respond to "Matt D.    |
> |>|        |          Robinson"                     |
> |>|        |                                        |
> |>|--------+---------------------------------------->
> |>
>---------------------------------------------------------------------------
--------------------------|
> |>  |
|
> |>  |      To:     lkcd-general@lists.sourceforge.net
|
> |>  |      cc:
|
> |>  |      Subject:     [lkcd-general] LKCD experimental directory
available                              |
> |>  |
|
> |>  |
|
> |>
>---------------------------------------------------------------------------
--------------------------|
> |>
> |>
> |>
> |>So, given that I have only so many hours in the day in which
> |>to do testing of any given piece of code, and knowing that any
> |>and all changes for other people's code requires lots and lots
> |>of additional testing, and given that multiple releases are now
> |>being requested, plus working on my own code ...
> |>
> |>I've created an "experimental" directory on the download site
> |>on lkcd.sourceforge.net for those developers and bleeding edge
> |>testers who want the latest release of code, but in patch/RPM
> |>form.  I'd normally just release this against a single version,
> |>but I'm getting lots of RedHat requests, which, while reasonable,
> |>are numerous, and I don't have time to test each release.
> |>
> |>So here's the deal.
> |>
> |>I'll put out the patches, RPMs, etc., but I need some help
> |>testing these.   If you would be interested in testing LKCD for
> |>new releases, I can certainly provide information how to do just
> |>that.  I just can't cover 2.4.2-2, 2.4.3-12, 2.4.9-12, etc.,
> |>etc., etc., much less keep up with the Linus/Alan fiasco.
> |>
> |>In the experimental directory right now is the 4.0.1 release.
> |>The differences between 4.0 and 4.0.1 are:
> |>
> |>- Patch from Suparna to handle SCSI/SMP interrupt cases, and
> |>  add in an additional bit of code to deal with wakeup on
> |>  kiobufs when dumps are configured;
> |>
> |>- Patch from Kapish to deal with dump/highmem pages having an
> |>  invalid dp_address assigned to the page dump headers;
> |>
> |>- Patch for reported problem from Alex Aminoff to deal with
> |>  commented-out swap partitions in /etc/fstab being linked as
> |>  the primary dump device;
> |>
> |>- Small clean-ups with zlib.h check-in, spec file update, etc.
> |>  Full gzip compression included in the patch;
> |>
> |>Please let me know if I've totally broken the RH 7.1 patch, or
> |>if the linux-2.4.8 patch has a problem.  If things work, though,
> |>I'll post the rest of the RedHat patches for 7.1 and 7.2
> |>(including updates) for everyone.  That means 2.4.2-2, 2.4.3-12,
> |>2.4.7-10, 2.4.9-12, and 2.4.9-13 (whew!)
> |>
> |>All the files are in:
> |>
> |>           lkcd.sourceforge.net/download/experimental/<release>
> |>
> |>Thanks, any and all help is appreciated.
> |>
> |>--Matt
> |>
> |>_______________________________________________
> |>Lkcd-general mailing list
> |>Lkcd-general@lists.sourceforge.net
> |>https://lists.sourceforge.net/lists/listinfo/lkcd-general
> |>
> |>
> |>
> |>
> |>_______________________________________________
> |>Lkcd-general mailing list
> |>Lkcd-general@lists.sourceforge.net
> |>https://lists.sourceforge.net/lists/listinfo/lkcd-general
> |>
>
>
> _______________________________________________
> Lkcd-general mailing list
> Lkcd-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/lkcd-general
>



From lkcd-general-owner@lists.sourceforge.net Tue Nov 27 20:52:15 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 168whi-0002dO-00
	for <lkcd-general@lists.sourceforge.net>; Tue, 27 Nov 2001 20:52:14 -0800
Received: from pneumatic-tube.sgi.com (pneumatic-tube.sgi.com [204.94.214.22])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAS5qEo02969
	for <lkcd@oss.sgi.com>; Tue, 27 Nov 2001 21:52:15 -0800
Received: from loco.csd.sgi.com (loco.csd.sgi.com [130.62.73.130]) by pneumatic-tube.sgi.com (980327.SGI.8.8.8-aspam/980310.SGI-aspam) via ESMTP id UAA05853
	for <lkcd@oss.sgi.com>; Tue, 27 Nov 2001 20:52:08 -0800 (PST)
	mail_from (tjm@sgi.com)
Received: from striker (striker.csd.sgi.com [130.62.73.131]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id UAA30695; Tue, 27 Nov 2001 20:47:45 -0800 (PST)
Message-ID: <006201c177c9$d3005230$83493e82@corp.sgi.com>
From: "Tom Morano" <tjm@sgi.com>
To: "Matt D. Robinson" <yakker@aparity.com>
Cc: <lkcd@oss.sgi.com>, "Tom Morano" <tjm@sgi.com>
References: <Pine.LNX.4.30.0111221514040.11360-100000@nakedeye.aparity.com> <004f01c177a1$aa471630$83493e82@corp.sgi.com>
Subject: Re: [lkcd-general] dump and highmem
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Tue Nov 27 20:53:01 2001
X-Original-Date: Tue, 27 Nov 2001 21:02:04 -0800

Well I saw (finally) what you did now...

You made a change to the lcrash/vmdump.c file to implement the same
functionality as the kernel. The only problem is that mem_loc in the kernel
is a physical address page number and mem_loc in the live_dump()
function is a kernel virtual address! Trying to use the PAGE_SHIFT
on the virtual address REALLY screwed things up. :]

I fixed that with a call to kl_virtop() (to convert the virtual address to a
physical address). This fixed the problem for the live dump that I was
having. I'm not sure what implications this might have with the high_memory
issue. Will high memory addresses be converted to physical addresses
properly by kl_virtop()? If so, then everything should work OK. If not, then
there might still be a problem (that should be fixed in the kl_virtop()
function).

Tom

----- Original Message -----
From: "Tom Morano" <tjm@sgi.com>
To: "Matt D. Robinson" <yakker@aparity.com>; "Andreas Herrmann"
<AHERRMAN@de.ibm.com>
Cc: <lkcd@oss.sgi.com>
Sent: Tuesday, November 27, 2001 4:14 PM
Subject: Re: [lkcd-general] dump and highmem


> Hi Matt,
>
> I've been playing around with this some (partly to test my recent changes
> and partly to
> check into the problem that Valerie is having). I have a question about
why
> we are shifting
> the address PAGE_SHIFT bits when we load it into the page header in the
> dump? For
> the i386 live dump that I generated, this results is a very wrong address
> and weird hash
> values. When this is combined with the recent changes you made for the
> high_memory
> issue, all I get is lcrash failing miserably during start-up (similar to
> what Adreas saw).
>
> When I changed the live_dump() code to store a physical address, then
> everything
> worked OK. What is the benefit of the shift? I'm concerned about this also
> from an snia
> point of view. We will have very complex system hardware/memory
> configurations (large
> amounts of memory on multiple nodes with large holes in the memory). I
> should be able
> to take any virtual address (mapped or non-mapped), convert it to a
physical
> address,
> and then find it in the dump.
>
> What am I missing here?
>
> Thanks,
>
> Tom
>
>
> ----- Original Message -----
> From: "Matt D. Robinson" <yakker@aparity.com>
> To: "Andreas Herrmann" <AHERRMAN@de.ibm.com>
> Cc: <lkcd@oss.sgi.com>
> Sent: Thursday, November 22, 2001 3:16 PM
> Subject: Re: [lkcd-general] dump and highmem
>
>
> > Hi, Andreas.  I think it would be better if we fixed the
> > livedump function to correct the dp_address mechanism to
> > move it more in-line with the kernel code.  We could fix
> > this in 'lcrash' itself, and differentiate the read, but
> > we can fix it in the vmdump.c file and make sure it works
> > for all cases.
> >
> > Let me fix this when I get back (Saturday).  Should be as
> > simple as referencing mem_loc << lkcdinfo.page_shift into
> > the dp_address.
> >
> > --Matt
> >
> > On Thu, 22 Nov 2001, Andreas Herrmann wrote:
> > |>Hi,
> > |>
> > |>I tried out the current cvs version of lcrash. And oops, lcrash fails
> while
> > |>reading its own dumps, generated with
> > |>lcrash's livedump command. I observed this on i386.
> > |>
> > |>Details: When initializing KL_HIGH_MEMORY, lcrash fails. Setting
> > |>cmp_debug=1, I received following output:
> > |>
> > |>__cmppread(): initiating search for 0x2424a8
> > |>__cmppindex(): hash =   8228, addr = 0x2424a8
> > |>__cmppindex(): addr = 0x2424a8, tmpptr->addr = 0xc0236000
> > |>__cmppread(): page not found! (0x2424a8)
> > |>
> > |>Using cvs version 1.8 of file libklib/kl_cmp.c lcrash works fine. The
> > |>corresponding output is:
> > |>
> > |>__cmppread(): initiating search for 0x2424a8
> > |>__cmppindex(): hash =   8228, addr = 0x2424a8
> > |>__cmppindex(): addr = 0x2424a8, tmpptr->addr = 0x242000
> > |>__cmppread(): found the page in the page index!
> > |>0x242000: 1247 -> 4096 COMPRESSED, writing 4096 bytes
> > |>__cmppinsert(): Malloc occurred! [0]
> > |>__cmppinsert(): Inserting page into cache! (0x2424a8) [0]...
> > |>
> > |>Probably the hash values were mixed up ...
> > |>
> > |>As I found out, the error is caused by Kapish's  patch, which was
> checked
> > |>in on 11/14/2001.(See attached mails.)
> > |>lcrash works fine with respect to livedumps when using version 1.8 of
> file
> > |>libklib/kl_cmp.c
> > |>which contains
> > |>
> > |>paddr = (kaddr_t)dp->dp_address - KL_PAGE_OFFSET;
> > |>
> > |>instead of
> > |>
> > |>paddr = (kaddr_t)dp->dp_address;
> > |>
> > |>in function __cmpconvertaddr().
> > |>
> > |>
> > |>Maybe someone has time to rework Kapish's patch to be "compatible"
with
> > |>livedumps, too?
> > |>
> > |>Regards,
> > |>
> > |>Andreas
> > |>
> > |>--
> > |>Linux for eServer Development
> > |>Tel :  +49-7031-16-4640
> > |>Notes mail :  Andreas Herrmann/GERMANY/IBM@IBMDE
> > |>email :  aherrman@de.ibm.com
> > |>
> > |>----- Forwarded by Andreas Herrmann/Germany/IBM on 11/22/01 06:57
> PM -----
> > |>|--------+---------------------------------------->
> > |>|        |          Kapish K <kapish@ureach.com>  |
> > |>|        |          Sent by:                      |
> > |>|        |          lkcd-general-admin@lists.sourc|
> > |>|        |          eforge.net                    |
> > |>|        |                                        |
> > |>|        |                                        |
> > |>|        |          11/13/01 06:28 AM             |
> > |>|        |          Please respond to kapish      |
> > |>|        |                                        |
> > |>|--------+---------------------------------------->
> > |>
>
>---------------------------------------------------------------------------
> --------------------------|
> > |>  |
> |
> > |>  |      To:     lkcd@oss.sgi.com
> |
> > |>  |      cc:
> |
> > |>  |      Subject:     [lkcd-general] dump and highmem
> |
> > |>  |
> |
> > |>  |
> |
> > |>
>
>---------------------------------------------------------------------------
> --------------------------|
> > |>
> > |>
> > |>
> > |>Hello,
> > |>           While trying to use lkcd and lcrash ( 4.0 ) on dumps from
> > |>highmem enbaled boxes, a colleague noticed what might be a bug
> > |>in the lkcd code.
> > |>The error seems to occur when lcrash looks at the headrrs of
> > |>loaded modules in the dump file, one of which is mapped at the
> > |>highmemory region of physical memory.
> > |>The dp_address field ( in add_dump_page ) is the virtual address
> > |>( obtained frpm page_address(p) which gets the page->vitual
> > |>address ), but for pages in highmemory, this would be zero
> > |>unless the page was kampped at that point in time during dump.
> > |>Is that right?
> > |>When lcrash starts, it seems to build an index of physical
> > |>pages, and
> > |>it uses the dp_address fields to determine the real memory
> > |>address by subtracting the kernel page offset (usually
> > |>0xC0000000).  Thus, the real memory address of the high memory
> > |>pages seem to be incorrect.
> > |>We fixed this by changing the code so that dp_address is a
> > |>real memory address rather than virtual.
> > |>so, the changes we did were the following:
> > |>in dump_base.c:
> > |>--- drivers/dump/dump_base.c.orig         Wed Nov  7 02:54:53 2001
> > |>+++ drivers/dump/dump_base.c        Fri Nov  9 02:24:05 2001
> > |>@@ -486,13 +486,12 @@
> > |> #if defined(CONFIG_X86) || defined(CONFIG_ALPHA)
> > |>           extern int page_is_ram(unsigned long);
> > |> #endif
> > |>-          unsigned long addr, size;
> > |>+          unsigned long size;
> > |>           dump_page_t dp;
> > |>           struct page *p = (struct page *)&(mem_map[mem_loc]);
> > |>           void *vaddr;
> > |>
> > |>-          addr = (unsigned long)page_address(p);
> > |>-          dp.dp_address = (uint64_t)addr;
> > |>+          dp.dp_address = (uint64_t)mem_loc << PAGE_SHIFT;
> > |>           dp.dp_flags = DUMP_DH_RAW;
> > |>
> > |>           /*
> > |>in lkcdutils-1.0-7.src.rpm:
> > |>
> > |>--- libklib/kl_cmp.c.orig           Sat Jun 16 22:50:10 2001
> > |>+++ libklib/kl_cmp.c           Thu Nov  8 18:11:43 2001
> > |>@@ -920,7 +920,7 @@
> > |> {
> > |>           kaddr_t paddr;
> > |>
> > |>-          paddr = (kaddr_t)dp->dp_address - KL_PAGE_OFFSET;
> > |>+          paddr = (kaddr_t)dp->dp_address;
> > |>           return (paddr);
> > |> }
> > |>
> > |>This seems to fix our problems with being able to look at trace
> > |>records for pages in high memory.
> > |>What I am looking for is whether this has already been
> > |>identified by the lckd team as a problem, and if so, has the fix
> > |>you plan the same? if this is not a problem, what have we missd
> > |>in here? And finally, if this is a problem and the the solution
> > |>is acceptable, could this change get into lkcd?
> > |>TIA
> > |>
> > |>________________________________________________
> > |>Get your own "800" number
> > |>Voicemail, fax, email, and a lot more
> > |>http://www.ureach.com/reg/tag
> > |>
> > |>_______________________________________________
> > |>Lkcd-general mailing list
> > |>Lkcd-general@lists.sourceforge.net
> > |>https://lists.sourceforge.net/lists/listinfo/lkcd-general
> > |>
> > |>----- Forwarded by Andreas Herrmann/Germany/IBM on 11/22/01 06:57
> PM -----
> > |>|--------+---------------------------------------->
> > |>|        |          "Matt D. Robinson"            |
> > |>|        |          <yakker@alacritech.com>       |
> > |>|        |          Sent by:                      |
> > |>|        |          lkcd-general-admin@lists.sourc|
> > |>|        |          eforge.net                    |
> > |>|        |                                        |
> > |>|        |                                        |
> > |>|        |          11/15/01 10:53 AM             |
> > |>|        |          Please respond to "Matt D.    |
> > |>|        |          Robinson"                     |
> > |>|        |                                        |
> > |>|--------+---------------------------------------->
> > |>
>
>---------------------------------------------------------------------------
> --------------------------|
> > |>  |
> |
> > |>  |      To:     lkcd-general@lists.sourceforge.net
> |
> > |>  |      cc:
> |
> > |>  |      Subject:     [lkcd-general] LKCD experimental directory
> available                              |
> > |>  |
> |
> > |>  |
> |
> > |>
>
>---------------------------------------------------------------------------
> --------------------------|
> > |>
> > |>
> > |>
> > |>So, given that I have only so many hours in the day in which
> > |>to do testing of any given piece of code, and knowing that any
> > |>and all changes for other people's code requires lots and lots
> > |>of additional testing, and given that multiple releases are now
> > |>being requested, plus working on my own code ...
> > |>
> > |>I've created an "experimental" directory on the download site
> > |>on lkcd.sourceforge.net for those developers and bleeding edge
> > |>testers who want the latest release of code, but in patch/RPM
> > |>form.  I'd normally just release this against a single version,
> > |>but I'm getting lots of RedHat requests, which, while reasonable,
> > |>are numerous, and I don't have time to test each release.
> > |>
> > |>So here's the deal.
> > |>
> > |>I'll put out the patches, RPMs, etc., but I need some help
> > |>testing these.   If you would be interested in testing LKCD for
> > |>new releases, I can certainly provide information how to do just
> > |>that.  I just can't cover 2.4.2-2, 2.4.3-12, 2.4.9-12, etc.,
> > |>etc., etc., much less keep up with the Linus/Alan fiasco.
> > |>
> > |>In the experimental directory right now is the 4.0.1 release.
> > |>The differences between 4.0 and 4.0.1 are:
> > |>
> > |>- Patch from Suparna to handle SCSI/SMP interrupt cases, and
> > |>  add in an additional bit of code to deal with wakeup on
> > |>  kiobufs when dumps are configured;
> > |>
> > |>- Patch from Kapish to deal with dump/highmem pages having an
> > |>  invalid dp_address assigned to the page dump headers;
> > |>
> > |>- Patch for reported problem from Alex Aminoff to deal with
> > |>  commented-out swap partitions in /etc/fstab being linked as
> > |>  the primary dump device;
> > |>
> > |>- Small clean-ups with zlib.h check-in, spec file update, etc.
> > |>  Full gzip compression included in the patch;
> > |>
> > |>Please let me know if I've totally broken the RH 7.1 patch, or
> > |>if the linux-2.4.8 patch has a problem.  If things work, though,
> > |>I'll post the rest of the RedHat patches for 7.1 and 7.2
> > |>(including updates) for everyone.  That means 2.4.2-2, 2.4.3-12,
> > |>2.4.7-10, 2.4.9-12, and 2.4.9-13 (whew!)
> > |>
> > |>All the files are in:
> > |>
> > |>           lkcd.sourceforge.net/download/experimental/<release>
> > |>
> > |>Thanks, any and all help is appreciated.
> > |>
> > |>--Matt
> > |>
> > |>_______________________________________________
> > |>Lkcd-general mailing list
> > |>Lkcd-general@lists.sourceforge.net
> > |>https://lists.sourceforge.net/lists/listinfo/lkcd-general
> > |>
> > |>
> > |>
> > |>
> > |>_______________________________________________
> > |>Lkcd-general mailing list
> > |>Lkcd-general@lists.sourceforge.net
> > |>https://lists.sourceforge.net/lists/listinfo/lkcd-general
> > |>
> >
> >
> > _______________________________________________
> > Lkcd-general mailing list
> > Lkcd-general@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/lkcd-general
> >
>
>
> _______________________________________________
> Lkcd-general mailing list
> Lkcd-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/lkcd-general
>



From lkcd-general-owner@lists.sourceforge.net Wed Nov 28 00:14:21 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 168zrC-000540-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 28 Nov 2001 00:14:14 -0800
Received: from nakedeye.aparity.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAS9EEo07572
	for <lkcd@oss.sgi.com>; Wed, 28 Nov 2001 01:14:15 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAS8ILQ28213;
	Wed, 28 Nov 2001 00:18:22 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: Tom Morano <tjm@sgi.com>
cc: Andreas Herrmann <AHERRMAN@de.ibm.com>, <lkcd@oss.sgi.com>
Subject: Re: [lkcd-general] dump and highmem
In-Reply-To: <004f01c177a1$aa471630$83493e82@corp.sgi.com>
Message-ID: <Pine.LNX.4.30.0111280007270.28096-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 28 00:15:02 2001
X-Original-Date: Wed, 28 Nov 2001 00:18:21 -0800 (PST)

I included Kapish's change with an understanding that the
highmem modification wouldn't matter for page shifts so long
as you took care of it in lkcdutils (both ends, essentially).
What I think, though, is that this problem can be remedied by
putting the code back to the original base, but fixing the
dump_base.c code to instead set the dp_address value based
on the return value from kmap_atomic().  The question I have
is whether this will be sufficient to fix Kapish's problem.

Kapish, since you have a highmem system handy, can you try to
set the dp.dp_address to the vaddr value returned from the
kmap_atomic() call when PageHighMem(p) is set, and let me know
if that fixes your problem?

If so, that deals with Kapish's problem without breaking the
original dp_address value.

In the meantime, though, I think it's better to move back to
the older model, and try to address Kapish's issue separately.

--Matt

On Tue, 27 Nov 2001, Tom Morano wrote:
|>Hi Matt,
|>
|>I've been playing around with this some (partly to test my recent changes
|>and partly to check into the problem that Valerie is having). I have a
|>question about why we are shifting the address PAGE_SHIFT bits when we
|>load it into the page header in the dump?



From lkcd-general-owner@lists.sourceforge.net Wed Nov 28 01:30:46 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 16911Z-0007LV-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 28 Nov 2001 01:29:01 -0800
Received: from ausmtp01.au.ibm.com (ausmtp01.au.ibm.COM [202.135.136.97])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fASASso09318
	for <lkcd@oss.sgi.com>; Wed, 28 Nov 2001 02:28:55 -0800
Received: from f02n15e.au.ibm.com 
        by ausmtp01.au.ibm.com (IBM AP 2.0) with ESMTP id fAS9Off140190;
        Wed, 28 Nov 2001 20:24:41 +1100
Received: from d23m0062.in.ibm.com (d23m0062.in.ibm.com [9.184.199.181])
	by f02n15e.au.ibm.com (8.11.1m3/NCO v5.01) with ESMTP id fAS9Qih23100;
	Wed, 28 Nov 2001 20:26:45 +1100
X-Priority: 1 (High)
Subject: Re: [lkcd-general] dump and highmem
To: "Matt D. Robinson" <yakker@aparity.com>
Cc: Andreas_Herrmann/Germany/IBM%IBMDE <aherrman@de.ibm.com>, lkcd@oss.sgi.com,
   lkcd-general-admin@lists.sourceforge.net, Tom Morano <tjm@sgi.com>
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OFA83277D2.6D2D1302-ON65256B12.0032EFF4@in.ibm.com>
From: "Suparna Bhattacharya" <bsuparna@in.ibm.com>
X-MIMETrack: Serialize by Router on d23m0062/23/M/IBM(Release 5.0.8 |June 18, 2001) at
 28/11/2001 02:57:33 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 28 01:31:03 2001
X-Original-Date: Wed, 28 Nov 2001 14:57:26 +0530


I haven't looked into lcrash much, but I thought you'd need a unique value
of dp.dp_address per page (even for high mem pages) . How would using the
vaddr returned by kmap_atomic work  ?

 Regards
 Suparna

  Suparna Bhattacharya
  Linux Technology Center
  IBM Software Lab, India
  E-mail : bsuparna@in.ibm.com
  Phone :  91-80-5044961



                                                                                                                                   
                    "Matt D. Robinson"                                                                                             
                    <yakker@aparity.com>                 To:     Tom Morano <tjm@sgi.com>                                          
                    Sent by:                             cc:     Andreas Herrmann/Germany/IBM@IBMDE, <lkcd@oss.sgi.com>            
                    lkcd-general-admin@lists.sourc       Subject:     Re: [lkcd-general] dump and highmem                          
                    eforge.net                                                                                                     
                                                                                                                                   
                                                                                                                                   
                    11/28/01 01:48 PM                                                                                              
                                                                                                                                   
                                                                                                                                   




I included Kapish's change with an understanding that the
highmem modification wouldn't matter for page shifts so long
as you took care of it in lkcdutils (both ends, essentially).
What I think, though, is that this problem can be remedied by
putting the code back to the original base, but fixing the
dump_base.c code to instead set the dp_address value based
on the return value from kmap_atomic().  The question I have
is whether this will be sufficient to fix Kapish's problem.

Kapish, since you have a highmem system handy, can you try to
set the dp.dp_address to the vaddr value returned from the
kmap_atomic() call when PageHighMem(p) is set, and let me know
if that fixes your problem?

If so, that deals with Kapish's problem without breaking the
original dp_address value.

In the meantime, though, I think it's better to move back to
the older model, and try to address Kapish's issue separately.

--Matt

_______________________________________________
Lkcd-general mailing list
Lkcd-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lkcd-general





From lkcd-general-owner@lists.sourceforge.net Wed Nov 28 01:42:22 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 1691EO-0000y2-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 28 Nov 2001 01:42:16 -0800
Received: from nakedeye.aparity.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fASAgGo09652
	for <lkcd@oss.sgi.com>; Wed, 28 Nov 2001 02:42:16 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAS9k3628328;
	Wed, 28 Nov 2001 01:46:03 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: Suparna Bhattacharya <bsuparna@in.ibm.com>
cc: Andreas_Herrmann/Germany/IBM%IBMDE <aherrman@de.ibm.com>,
   <lkcd@oss.sgi.com>, <lkcd-general-admin@lists.sourceforge.net>,
   Tom Morano <tjm@sgi.com>
Subject: Re: [lkcd-general] dump and highmem
In-Reply-To: <OFA83277D2.6D2D1302-ON65256B12.0032EFF4@in.ibm.com>
Message-ID: <Pine.LNX.4.30.0111280138160.28096-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 28 01:43:02 2001
X-Original-Date: Wed, 28 Nov 2001 01:46:03 -0800 (PST)

On Wed, 28 Nov 2001, Suparna Bhattacharya wrote:
|>I haven't looked into lcrash much, but I thought you'd need a unique value
|>of dp.dp_address per page (even for high mem pages) . How would using the
|>vaddr returned by kmap_atomic work  ?

You do, and you're right.  max_mapnr already accounts for the
highend_pfn value with CONFIG_HIGHMEM, so the mem_loc is the
right value to use for dp_address in all cases.  I just went
and looked at this again in the init.c/setup.c arch code.

In any event, the code is back to the way it was.  Sorry for
any additional confusion, I'll stop now before I make things
worse. :)

--Matt



From matsuoka@css1.kbnes.nec.co.jp Wed Nov 28 21:55:07 2001
Received: from tyo202.gate.nec.co.jp ([202.247.6.41])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 169KA2-00024N-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 28 Nov 2001 21:55:03 -0800
Received: from mailgate4.nec.co.jp ([10.7.69.195])
	by TYO202.gate.nec.co.jp (8.11.6/3.7W01080315) with ESMTP id fAT5svO22081;
	Thu, 29 Nov 2001 14:54:57 +0900 (JST)
Received: from mailsv4.nec.co.jp (mailgate51.nec.co.jp [10.7.69.190]) by mailgate4.nec.co.jp (8.11.6/3.7W-MAILGATE-NEC) with ESMTP
	id fAT5sra28105; Thu, 29 Nov 2001 14:54:53 +0900 (JST)
Received: from sv3l.kbnes.nec.co.jp (mailsv1.kbnes.nec.co.jp [10.108.80.233]) by mailsv4.nec.co.jp (8.11.6/3.7W-MAILSV4-NEC) with ESMTP
	id fAT5sqi24358; Thu, 29 Nov 2001 14:54:52 +0900 (JST)
Received: from phoebe.css1.kbnes.nec.co.jp ([172.16.32.5])
	by sv3l.kbnes.nec.co.jp (8.11.2/3.7W/01/24/01) with ESMTP id fAT5spd32806;
	Thu, 29 Nov 2001 14:54:51 +0900 (JST)
Received: from dragoon.css1.kbnes.nec.co.jp (dragoon [172.16.32.8])
	by phoebe.css1.kbnes.nec.co.jp (8.8.8+2.7Wbeta7/3.6W-nomx) with ESMTP id GAA21043;
	Thu, 29 Nov 2001 06:01:49 GMT
Received: from math2 ([172.16.28.219]) by dragoon.css1.kbnes.nec.co.jp (8.8.8+2.7Wbeta7/3.5Wpl7-nomx) with ESMTP id OAA29103; Thu, 29 Nov 2001 14:54:47 +0900 (JST)
From: Ken-ichi Matsuoka <matsuoka@css1.kbnes.nec.co.jp>
To: lkcd@oss.sgi.com, lkcd-general@lists.sourceforge.net
Cc: r-shibano@pb.jp.nec.com
Message-Id: <20011129144552.A43B.MATSUOKA@css1.kbnes.nec.co.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.00.07
Subject: [lkcd-general] [Proposal] dump dedicated driver with polling method
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 28 21:56:02 2001
X-Original-Date: Thu, 29 Nov 2001 14:54:13 +0900

Hi, all

To dump the memory image when crashing on the badly damaged system, 
we think reliable dump I/O is required at first.

But current LKCD unfortunately cannot dump the memory image when crashing, 
so we cannot investigate crash dump. (The detail is below.)

To solve this issue, we propose the dump dedicated driver, which uses polling 
method, manages necessary buffers by itself.

Any comment is welcomed.


0. Table of Contents
  1. The abstract of proposal
    1.1. Purpose
    1.2. LKCD's problem
    1.3. Our proposal
    1.4. How to proceed
    1.5. Schedule

  2. The details of problems
    2.1. The case of the cause 1
    2.2. The case of the cause 2
  3. The details of proposal
    3-1. The idea to the cause 1
    3-2. The idea to the cause 2


The following is the abstract of proposal:

1. The abstract of proposal
  1.1. The purpose
       To dump the memory image when crashing on the badly damaged system, 
       We aim the following:
 
         We advance LKCD's implementation so that LKCD can dump the memory 
         image in any status of the system. 

  1.2. LKCD's problem

       There are the following cases that we cannot investigate crash dump:

       [Issue]
          LKCD cannot dump the memory image when crashing, so we cannot
          investigate crash dump.
  
       The cause of this issue is linked to below.

       (Cause 1) LKCD doesn't disable the interruption during dump I/O, 
                 so the interrupt handlers of other drivers are processed.
                 The interrupt handler modifies kernel resources during 
                 dump I/O.

       (Cause 2) LKCD uses original raw I/O.
                 Raw I/O uses kernel resources, so the following happens:

                 (A) Kernel resources are modified during dump I/O.
                 (B) If the dumped device has already had I/O requests, 
                     these requests are processed during dump I/O.

  1.3. Our proposal
       The followings are our ideas to implement above causes:
       
         (Against 1) dump driver with polling method 
         (Against 2) LKCD personally allocates the resources of I/O requests 
                     and manages them by replacing the original ones.

       To do, We propose dump dedicated driver, which uses polling method,
       manages necessary buffers by itself.


  1.4. How to proceed
       We proceed the following:
         
       1. Driver Developer develops the dedicated driver by chip.
          To fix the interface between the dump upper driver
          and the dedicated driver by chip is required.
          At first, We make the specification draft.
       
       2. We provide the specification draft to LKCD-ML, 
          discuss  with you, and fix it.

  1.5. Schedule
       At the end of Dec. 2001 Specification Draft 
                               : We provide the specification draft to
                                 LKCD-ML.
       At the end of Apr. 2002 Specification Rev. 1
                               : Fix specification with you
                             
2. The details of problems
  2.1. The case of the cause 1
       (Modified kernel resources in the interruption processing)

       Case1-1. Timer list are modified.
              (What happens) 
                Timer routine is called by timer_bh which is the bottom half
                handler of timer interruption.
                And timer function of the device driver is processed in timer
                routine, so the status or control flags of device driver are 
                modified by the timer function.
              
              (What troubles)
                The status or control flags of device driver are modified, so
                you cannot grasp the status of the device driver at the time 
                of software failure and cannot investigate the cause of it.

                For example:
                  When the network driver has a trouble, the status of
                  network driver is modified during dump I/O.
                  So you cannot investigate.
                  
       Case1-2. The resources of the device driver which uses software 
                interrupt handler are modified.

              (What happens) 
                Software interrupt handler is called, so the resources of
                the device driver using it are modified. 

              (What troubles)
                You cannot investigate the resources of the device driver
                using software interrupt handler.

                For example, when software interrupt handler for network 
                receiving routine is triggered, the receive routine is 
                processed, and the status of socket are modified.
                So you cannot investigate them.

  2.2. The case of the cause 2
       (Modified kernel resources in the raw I/O processing)

       2.2.1. The case of the cause 2-(A)

       Case2-1. The resources of the "struct request" are modified.

              (What happens) 
                 These resources are allocated by the device driver 
                 at the initial phase, and are managed by Kernel block I/O.
                 But dump I/O processes the following processes, so these ones
                 are modified.

                   By dump I/O, the "struct request" is taken from free 
                   request queue (in get_request), and this one is 
                   enqueued to the request queue of the device driver
                   (in add_request). So these request queues are modified.
                    
              (What troubles)
                 At the error of I/O routine, you cannot know what request is
                 the current request and what the status of request is. 

       Case2-2. The resources of the "struct scsi_cmnd" are modified.
              (What happens) 
                 These resources are allocated by SCSI driver at the 
                 initializing phase, and are managed by SCSI driver.
                 But dump I/O processes the following processes, so these ones 
                 are modified.
                   
                   After enqueuing request by dump I/O, the "struct scsi_cmnd"
                   is taken from free list (in scsi_allocate_device),
                   and is dispatched to low-level driver (in scsi_dispatch_cmd).
                   So the resources of the "struct scsi_cmnd" are modified.
 
              (What troubles)
                  At the error of I/O routine, you cannot know what command is
                  processed and what the status of the command is.

       2.2.2. The case of the cause 2-(B)
  
       Case2-3. The resources of the "struct request" and the 
                "struct scsi_cmnd" are modified.
              (What happens) 
                 If the dumped device has already had I/O requests, that is, 
                 if the dumped device has already enqueued some requests, 
                 these ones are processed by dump I/O.
  
                 For example, in the case that the disk connected in H/W raid 
                 has I/O trouble, if you want to dump memory to SCSI disk, 
                 the requests enqueued in SCSI disk has processed.             

              (What troubles)
                 By dump I/O, the status of requests are modified, you cannot
                 investigate the status of requests when crashing.


3. The details of proposal

We propose the following ideas in order not to happen these cases:

  3-1. The idea to the cause 1

       [Proposal] dump driver with polling method
         
       The background of this proposal is below.

       To implement the cause 1, the I/O method which doesn't modify
       kernel resources in interrupt processing is required.
       We have considered 5 methods:

        (1) Disable all interruption by the processor function,
            such as clear interrupt flag (cli) routine on IA-32.
        (2) Disable all interruption by the interrupt controller.
        (3) Disable the interruption except for timer interruption 
            and the interruption for dump I/O.
        (4) In addition to (3), remove the timer routine 
            and software interrupt handler which are independent
            of dump I/O.
        (5) Use of polling mechanism, Disable all interruption

       [The advantage and fault of each method]
        (1) You cannot disable the interruption, because kernel
            or the device driver has the routine which enables 
            the interruption.
            That is, Modification of kernel resources happens.
        (2) Modification of kernel resources doesn't happen.
            But you cannot execute waiting routine of I/O 
            completion and the routine of I/O timeout, because
            all interruption is disabled.
        (3) Modification of kernel resources happens, because
            timer routine and software interrupt handler are
            called.
        (4) Modification of kernel resources doesn't happen.
            To do this method, you need the following processes, 
            But these processes are unreal.
            
            (A) Disable timer routine except for dump I/O routine
            (B) Disable software interrupt handlers except for 
                timer_bh and scsi_bottom_half_handler

        (5) Above problems doesn't happen, because in this method,
            dedicated driver manage all I/O routine such as
            waiting routine of I/O completion, the routine of
            I/O timeout. 
            
       [The result of our consideration]
        To implement the cause 1, I propose the method (5).


  3-2. The idea to the cause 2
       The following idea is the implementation against the cause 2:

       [Proposal] LKCD personally allocates the resources of I/O requests and 
                  manages them by replacing the original ones.
 
       The background of this proposal is below.
       
       To implement the cause 2, not to use kernel resources is required.
       
       We consider the following idea:

       [How to proceed]
        Replace the original resources linked in the device driver 
        with the resources allocated by LKCD.

       [To do]
        The following processes are required to change resources:

        (1) During the configuration of dump system, allocate the resources
            which will be replaced original resources with. 
        (2) The queue linked in the device driver (such as request)
            is depended on the state of chip very much, because 
            not only Kernel I/O but also the device driver uses 
            this queue. 
            But it is difficult to grasp the state of the chip.
            So to reset device driver before dump I/O is required.
        (3) Save all resources and replace them with the resources
            allocated by LKCD.

       [Implementation of each resource] 
            
        Case 1. The resources of the "struct request" are modified.
          (1) During the configuration of dump system (dump_open_kdev), 
              allocate the resources.
          (2) Before dump I/O, reset the device driver.
          (3) Save all request queue linked in the device driver.
          (4) Replace the request queue with the allocated resources.
          (5) Kernel I/O manages these ones.

        Case 2. The resources of the "struct scsi_cmnd" are modified.
          (1) During the configuration of dump system (dump_open_kdev), 
              allocate the resources.
          (2) Before dump I/O, reset the device driver.
          (3) Save all scsi_cmnd queue linked in scsi driver.
          (4) Replace the scsi_cmnd queue with the allocated resources.
          (5) SCSI driver manages these ones.

        Case 3. The resources of the "struct request" and the 
                "struct scsi_cmnd" are modified.
                   
          Case3 is implemented by the ideas for Case1, 2. 


best regards, 

================================================================
 Kenichi Matsuoka                  2nd Engineering Department 
 matsuoka@css1.kbnes.nec.co.jp     Computers Software Division
 (k-matsuoka@pd.jp.nec.com)        2nd Operations Unit
 Tel +81 78-991-5578               NEC System Technologies, Ltd. 
 FAX +81 78-992-5080
================================================================



From lkcd-general-owner@lists.sourceforge.net Wed Nov 28 21:55:17 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 169KAD-00026D-00
	for <lkcd-general@lists.sourceforge.net>; Wed, 28 Nov 2001 21:55:13 -0800
Received: from TYO202.gate.nec.co.jp (TYO202.gate.nec.co.jp [202.247.6.41])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAT6tCo24380
	for <lkcd@oss.sgi.com>; Wed, 28 Nov 2001 22:55:12 -0800
Received: from mailgate4.nec.co.jp ([10.7.69.195])
	by TYO202.gate.nec.co.jp (8.11.6/3.7W01080315) with ESMTP id fAT5svO22081;
	Thu, 29 Nov 2001 14:54:57 +0900 (JST)
Received: from mailsv4.nec.co.jp (mailgate51.nec.co.jp [10.7.69.190]) by mailgate4.nec.co.jp (8.11.6/3.7W-MAILGATE-NEC) with ESMTP
	id fAT5sra28105; Thu, 29 Nov 2001 14:54:53 +0900 (JST)
Received: from sv3l.kbnes.nec.co.jp (mailsv1.kbnes.nec.co.jp [10.108.80.233]) by mailsv4.nec.co.jp (8.11.6/3.7W-MAILSV4-NEC) with ESMTP
	id fAT5sqi24358; Thu, 29 Nov 2001 14:54:52 +0900 (JST)
Received: from phoebe.css1.kbnes.nec.co.jp ([172.16.32.5])
	by sv3l.kbnes.nec.co.jp (8.11.2/3.7W/01/24/01) with ESMTP id fAT5spd32806;
	Thu, 29 Nov 2001 14:54:51 +0900 (JST)
Received: from dragoon.css1.kbnes.nec.co.jp (dragoon [172.16.32.8])
	by phoebe.css1.kbnes.nec.co.jp (8.8.8+2.7Wbeta7/3.6W-nomx) with ESMTP id GAA21043;
	Thu, 29 Nov 2001 06:01:49 GMT
Received: from math2 ([172.16.28.219]) by dragoon.css1.kbnes.nec.co.jp (8.8.8+2.7Wbeta7/3.5Wpl7-nomx) with ESMTP id OAA29103; Thu, 29 Nov 2001 14:54:47 +0900 (JST)
From: Ken-ichi Matsuoka <matsuoka@css1.kbnes.nec.co.jp>
To: lkcd@oss.sgi.com, lkcd-general@lists.sourceforge.net
Cc: r-shibano@pb.jp.nec.com
Message-Id: <20011129144552.A43B.MATSUOKA@css1.kbnes.nec.co.jp>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Mailer: Becky! ver. 2.00.07
Subject: [lkcd-general] [Proposal] dump dedicated driver with polling method
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Wed Nov 28 21:56:02 2001
X-Original-Date: Thu, 29 Nov 2001 14:54:13 +0900

Hi, all

To dump the memory image when crashing on the badly damaged system, 
we think reliable dump I/O is required at first.

But current LKCD unfortunately cannot dump the memory image when crashing, 
so we cannot investigate crash dump. (The detail is below.)

To solve this issue, we propose the dump dedicated driver, which uses polling 
method, manages necessary buffers by itself.

Any comment is welcomed.


0. Table of Contents
  1. The abstract of proposal
    1.1. Purpose
    1.2. LKCD's problem
    1.3. Our proposal
    1.4. How to proceed
    1.5. Schedule

  2. The details of problems
    2.1. The case of the cause 1
    2.2. The case of the cause 2
  3. The details of proposal
    3-1. The idea to the cause 1
    3-2. The idea to the cause 2


The following is the abstract of proposal:

1. The abstract of proposal
  1.1. The purpose
       To dump the memory image when crashing on the badly damaged system, 
       We aim the following:
 
         We advance LKCD's implementation so that LKCD can dump the memory 
         image in any status of the system. 

  1.2. LKCD's problem

       There are the following cases that we cannot investigate crash dump:

       [Issue]
          LKCD cannot dump the memory image when crashing, so we cannot
          investigate crash dump.
  
       The cause of this issue is linked to below.

       (Cause 1) LKCD doesn't disable the interruption during dump I/O, 
                 so the interrupt handlers of other drivers are processed.
                 The interrupt handler modifies kernel resources during 
                 dump I/O.

       (Cause 2) LKCD uses original raw I/O.
                 Raw I/O uses kernel resources, so the following happens:

                 (A) Kernel resources are modified during dump I/O.
                 (B) If the dumped device has already had I/O requests, 
                     these requests are processed during dump I/O.

  1.3. Our proposal
       The followings are our ideas to implement above causes:
       
         (Against 1) dump driver with polling method 
         (Against 2) LKCD personally allocates the resources of I/O requests 
                     and manages them by replacing the original ones.

       To do, We propose dump dedicated driver, which uses polling method,
       manages necessary buffers by itself.


  1.4. How to proceed
       We proceed the following:
         
       1. Driver Developer develops the dedicated driver by chip.
          To fix the interface between the dump upper driver
          and the dedicated driver by chip is required.
          At first, We make the specification draft.
       
       2. We provide the specification draft to LKCD-ML, 
          discuss  with you, and fix it.

  1.5. Schedule
       At the end of Dec. 2001 Specification Draft 
                               : We provide the specification draft to
                                 LKCD-ML.
       At the end of Apr. 2002 Specification Rev. 1
                               : Fix specification with you
                             
2. The details of problems
  2.1. The case of the cause 1
       (Modified kernel resources in the interruption processing)

       Case1-1. Timer list are modified.
              (What happens) 
                Timer routine is called by timer_bh which is the bottom half
                handler of timer interruption.
                And timer function of the device driver is processed in timer
                routine, so the status or control flags of device driver are 
                modified by the timer function.
              
              (What troubles)
                The status or control flags of device driver are modified, so
                you cannot grasp the status of the device driver at the time 
                of software failure and cannot investigate the cause of it.

                For example:
                  When the network driver has a trouble, the status of
                  network driver is modified during dump I/O.
                  So you cannot investigate.
                  
       Case1-2. The resources of the device driver which uses software 
                interrupt handler are modified.

              (What happens) 
                Software interrupt handler is called, so the resources of
                the device driver using it are modified. 

              (What troubles)
                You cannot investigate the resources of the device driver
                using software interrupt handler.

                For example, when software interrupt handler for network 
                receiving routine is triggered, the receive routine is 
                processed, and the status of socket are modified.
                So you cannot investigate them.

  2.2. The case of the cause 2
       (Modified kernel resources in the raw I/O processing)

       2.2.1. The case of the cause 2-(A)

       Case2-1. The resources of the "struct request" are modified.

              (What happens) 
                 These resources are allocated by the device driver 
                 at the initial phase, and are managed by Kernel block I/O.
                 But dump I/O processes the following processes, so these ones
                 are modified.

                   By dump I/O, the "struct request" is taken from free 
                   request queue (in get_request), and this one is 
                   enqueued to the request queue of the device driver
                   (in add_request). So these request queues are modified.
                    
              (What troubles)
                 At the error of I/O routine, you cannot know what request is
                 the current request and what the status of request is. 

       Case2-2. The resources of the "struct scsi_cmnd" are modified.
              (What happens) 
                 These resources are allocated by SCSI driver at the 
                 initializing phase, and are managed by SCSI driver.
                 But dump I/O processes the following processes, so these ones 
                 are modified.
                   
                   After enqueuing request by dump I/O, the "struct scsi_cmnd"
                   is taken from free list (in scsi_allocate_device),
                   and is dispatched to low-level driver (in scsi_dispatch_cmd).
                   So the resources of the "struct scsi_cmnd" are modified.
 
              (What troubles)
                  At the error of I/O routine, you cannot know what command is
                  processed and what the status of the command is.

       2.2.2. The case of the cause 2-(B)
  
       Case2-3. The resources of the "struct request" and the 
                "struct scsi_cmnd" are modified.
              (What happens) 
                 If the dumped device has already had I/O requests, that is, 
                 if the dumped device has already enqueued some requests, 
                 these ones are processed by dump I/O.
  
                 For example, in the case that the disk connected in H/W raid 
                 has I/O trouble, if you want to dump memory to SCSI disk, 
                 the requests enqueued in SCSI disk has processed.             

              (What troubles)
                 By dump I/O, the status of requests are modified, you cannot
                 investigate the status of requests when crashing.


3. The details of proposal

We propose the following ideas in order not to happen these cases:

  3-1. The idea to the cause 1

       [Proposal] dump driver with polling method
         
       The background of this proposal is below.

       To implement the cause 1, the I/O method which doesn't modify
       kernel resources in interrupt processing is required.
       We have considered 5 methods:

        (1) Disable all interruption by the processor function,
            such as clear interrupt flag (cli) routine on IA-32.
        (2) Disable all interruption by the interrupt controller.
        (3) Disable the interruption except for timer interruption 
            and the interruption for dump I/O.
        (4) In addition to (3), remove the timer routine 
            and software interrupt handler which are independent
            of dump I/O.
        (5) Use of polling mechanism, Disable all interruption

       [The advantage and fault of each method]
        (1) You cannot disable the interruption, because kernel
            or the device driver has the routine which enables 
            the interruption.
            That is, Modification of kernel resources happens.
        (2) Modification of kernel resources doesn't happen.
            But you cannot execute waiting routine of I/O 
            completion and the routine of I/O timeout, because
            all interruption is disabled.
        (3) Modification of kernel resources happens, because
            timer routine and software interrupt handler are
            called.
        (4) Modification of kernel resources doesn't happen.
            To do this method, you need the following processes, 
            But these processes are unreal.
            
            (A) Disable timer routine except for dump I/O routine
            (B) Disable software interrupt handlers except for 
                timer_bh and scsi_bottom_half_handler

        (5) Above problems doesn't happen, because in this method,
            dedicated driver manage all I/O routine such as
            waiting routine of I/O completion, the routine of
            I/O timeout. 
            
       [The result of our consideration]
        To implement the cause 1, I propose the method (5).


  3-2. The idea to the cause 2
       The following idea is the implementation against the cause 2:

       [Proposal] LKCD personally allocates the resources of I/O requests and 
                  manages them by replacing the original ones.
 
       The background of this proposal is below.
       
       To implement the cause 2, not to use kernel resources is required.
       
       We consider the following idea:

       [How to proceed]
        Replace the original resources linked in the device driver 
        with the resources allocated by LKCD.

       [To do]
        The following processes are required to change resources:

        (1) During the configuration of dump system, allocate the resources
            which will be replaced original resources with. 
        (2) The queue linked in the device driver (such as request)
            is depended on the state of chip very much, because 
            not only Kernel I/O but also the device driver uses 
            this queue. 
            But it is difficult to grasp the state of the chip.
            So to reset device driver before dump I/O is required.
        (3) Save all resources and replace them with the resources
            allocated by LKCD.

       [Implementation of each resource] 
            
        Case 1. The resources of the "struct request" are modified.
          (1) During the configuration of dump system (dump_open_kdev), 
              allocate the resources.
          (2) Before dump I/O, reset the device driver.
          (3) Save all request queue linked in the device driver.
          (4) Replace the request queue with the allocated resources.
          (5) Kernel I/O manages these ones.

        Case 2. The resources of the "struct scsi_cmnd" are modified.
          (1) During the configuration of dump system (dump_open_kdev), 
              allocate the resources.
          (2) Before dump I/O, reset the device driver.
          (3) Save all scsi_cmnd queue linked in scsi driver.
          (4) Replace the scsi_cmnd queue with the allocated resources.
          (5) SCSI driver manages these ones.

        Case 3. The resources of the "struct request" and the 
                "struct scsi_cmnd" are modified.
                   
          Case3 is implemented by the ideas for Case1, 2. 


best regards, 

================================================================
 Kenichi Matsuoka                  2nd Engineering Department 
 matsuoka@css1.kbnes.nec.co.jp     Computers Software Division
 (k-matsuoka@pd.jp.nec.com)        2nd Operations Unit
 Tel +81 78-991-5578               NEC System Technologies, Ltd. 
 FAX +81 78-992-5080
================================================================



From lkcd-general-owner@lists.sourceforge.net Thu Nov 29 00:36:12 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 169Mfu-0001v4-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 29 Nov 2001 00:36:06 -0800
Received: from nakedeye.aparity.com (w032.z064001165.sjc-ca.dsl.cnc.net [64.1.165.32])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAT9Zvo06605
	for <lkcd@oss.sgi.com>; Thu, 29 Nov 2001 01:35:58 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAT8cZJ29493;
	Thu, 29 Nov 2001 00:38:35 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: "Vamsi Krishna S ." <vamsi@in.ibm.com>
cc: <lkcd@oss.sgi.com>, <lkcd-general@lists.sourceforge.net>,
   bharata <bharata@in.ibm.com>, suparna <bsuparna@in.ibm.com>,
   subodh <subodh@in.ibm.com>
Subject: Re: [lkcd-general] [PATCH]capturing registers/stack on all processors
In-Reply-To: <20011127143019.A8322@in.ibm.com>
Message-ID: <Pine.LNX.4.30.0111290020130.29286-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov 29 00:37:02 2001
X-Original-Date: Thu, 29 Nov 2001 00:38:35 -0800 (PST)

Looks good, Vamsi.  A couple of points:

- You'll need to fix the cmds.c file to deal with the NULL argument
  at the end of the new commands (or it will conflict with Naomi-san's
  latest code)
- Why hw_irq.h?  I didn't see any CONFIG_DUMP stuff in there ...
- I think you need to add the printk("Dump "); for the default case and
  add CONFIG_DUMP_MODULE for the 'd' command in sysrq.c (unless there's
  a reason not to have it?

My only other concern, which isn't that big, is that someone will
complain if we try to add show_this_cpu_state() into 2.5, as it
is mostly duplicate code.

Great job, Vamsi.  Are all of you on #lkcd?

--Matt

On Tue, 27 Nov 2001, Vamsi Krishna S . wrote:
|>Hello,
|>
|>Here is a patch against lkcd cvs (as on 11/26/2001) for capturing
|>registers on all processors at the time of dumping.
|>
|>This has been found to be crucial to debug problems where some of
|>the cpus on an SMP are hung (executing a tight loop, interrupts
|>disabled).
|>
|>We send an NMI-class IPI to other cpus to capture the registers
|>and stack. This is the only guaranteed way to ensure that other
|>cpus respond. If they don't respond to NMI, there is absolutely
|>nothing we can do in software.
|>
|>We need to capture the stack, even though we would prefer not to.
|>The reason being that the stack could change between the time the
|>registers are captured and the time that page is written out in
|>the dumping process. The chages in the stack could be so
|>significant as to render backtracing impossible/totally inaccurate.
|>
|>Currently, all the changes we made are specific to i386, even
|>though many of the changes could have been arch-independent.
|>
|>Brief list of chages:
|>
|>kernel:
|>- extensions to dump_header_asm_t to add fields to capture:
|>	- smp_num_cpus and dumping_cpu
|>	- registers of all processors
|>	- pointers to current tasks
|>	- pointers to the location where stacks are saved
|>- remove __dump_save_panic_regs
|>- collect registers in panic()
|>- remove all use of dha_esp, dha_eip, dha_regs and use
|>  dha_smp_regs consistantly
|>- cleanup dump_configure_header handling, ie, do it only
|>  once in dump_execute
|>- send NMI to all processors and capture their registers,
|>  current task and kernel stack as part of
|>  __configure_dump_header
|>- [bonus] new magic sysrq key 'd' to show the registers
|>  and, backtrace if inside kernel, on all processors
|>- [side effect] as part of capturing registers on panic
|>  we now seem to be able to backtrace correctly in
|>  panic dump cases.
|>
|>lcrash:
|>- new commands
|>	- rd
|>	- defcpu
|>- rd to display registers captured at the time of taking the
|>  dump on the processor which is currently the defcpu
|>- defcpu to set the default cpu and set deftask to the current
|>  task on that cpu at the time of dump
|>- new kl_smp_dumptask to determine while backtracing if this
|>  task is a current task on any of the processors at the time
|>  of dump
|>- changes to kl_dumpesp/kl_dumpeip to get the esp/eip values
|>  from dha_smp_regs.
|>- changes to get_block() to look at the saved stack if this
|>  task is a current task on any of the processors and was
|>  inside the kernel when the dump was taken
|>- [unrelated bug fix] fix lkcd_config.c to pass the values of
|>  dump level, dump flags and compression_type instead of their
|>  addresses to the ioctl call to set them.
|>
|>
|>--
|>LKCD Team India
|>Linux Technology Center,
|>IBM Software Lab, Bangalore.
|>Ph: +91 80 5044959
|>Internet: vamsi@in.ibm.com
|>
|>--
|>
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/arch/i386/kernel/i386_ksyms.c lkcd_cvs_new/2.4/arch/i386/kernel/i386_ksyms.c
|>--- lkcd_cvs_orig/2.4/arch/i386/kernel/i386_ksyms.c	Mon Sep 24 15:31:42 2001
|>+++ lkcd_cvs_new/2.4/arch/i386/kernel/i386_ksyms.c	Mon Nov 26 14:03:33 2001
|>@@ -31,8 +31,6 @@
|>
|> extern void dump_thread(struct pt_regs *, struct user *);
|> extern spinlock_t rtc_lock;
|>-extern irq_desc_t irq_desc[];
|>-extern unsigned long irq_affinity[];
|>
|> #if defined(CONFIG_APM) || defined(CONFIG_APM_MODULE)
|> extern void machine_real_restart(unsigned char *, int);
|>@@ -150,8 +148,6 @@
|> #endif
|>
|> EXPORT_SYMBOL(get_wchan);
|>-EXPORT_SYMBOL(irq_affinity);
|>-EXPORT_SYMBOL(irq_desc);
|>
|> EXPORT_SYMBOL(rtc_lock);
|>
|>@@ -164,4 +160,17 @@
|>
|> #ifdef CONFIG_X86_PAE
|> EXPORT_SYMBOL(empty_zero_page);
|>+#endif
|>+
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+extern irq_desc_t irq_desc[];
|>+extern unsigned long irq_affinity[];
|>+EXPORT_SYMBOL(irq_affinity);
|>+EXPORT_SYMBOL(irq_desc);
|>+#ifdef CONFIG_SMP
|>+extern void dump_send_ipi(void);
|>+EXPORT_SYMBOL(dump_send_ipi);
|>+extern int (*dump_ipi_function_ptr)(struct pt_regs *);
|>+EXPORT_SYMBOL(dump_ipi_function_ptr);
|>+#endif
|> #endif
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/arch/i386/kernel/smp.c lkcd_cvs_new/2.4/arch/i386/kernel/smp.c
|>--- lkcd_cvs_orig/2.4/arch/i386/kernel/smp.c	Tue Oct 16 12:51:44 2001
|>+++ lkcd_cvs_new/2.4/arch/i386/kernel/smp.c	Mon Nov 26 14:06:02 2001
|>@@ -142,6 +142,15 @@
|> 	 */
|> 	cfg = __prepare_ICR(shortcut, vector);
|>
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+	if (vector == DUMP_VECTOR) {
|>+		/*
|>+		 * Setup DUMP IPI to be delivered as an NMI
|>+		 */
|>+		cfg = (cfg&~APIC_VECTOR_MASK)|APIC_DM_NMI;
|>+	}
|>+#endif	/* CONFIG_DUMP */
|>+
|> 	/*
|> 	 * Send the IPI. The write to APIC_ICR fires this off.
|> 	 */
|>@@ -424,6 +433,13 @@
|>
|> 	do_flush_tlb_all_local();
|> }
|>+
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+void dump_send_ipi(void)
|>+{
|>+	send_IPI_allbutself(DUMP_VECTOR);
|>+}
|>+#endif
|>
|> /*
|>  * this function sends a 'reschedule' IPI to another CPU.
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/arch/i386/kernel/traps.c lkcd_cvs_new/2.4/arch/i386/kernel/traps.c
|>--- lkcd_cvs_orig/2.4/arch/i386/kernel/traps.c	Wed Sep 26 15:16:15 2001
|>+++ lkcd_cvs_new/2.4/arch/i386/kernel/traps.c	Mon Nov 26 16:46:48 2001
|>@@ -89,6 +89,105 @@
|>
|> int kstack_depth_to_print = 24;
|>
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+/*
|>+ * This code mimics show_trace() etc in arch/i386/kernel/traps.c. We don't
|>+ * use them directly as they depend on 8K aligned kernel stacks that our
|>+ * saved stacks don't satisfy. However, there is move to relax the requirement
|>+ * on task_struct to be 8K-aligned. Once that happens, we could simpify this
|>+ * function.
|>+ */
|>+void show_this_cpu_state(int cpu, struct pt_regs * regs, struct task_struct *tsk)
|>+{
|>+	int i;
|>+	unsigned long *esp;
|>+	unsigned char *c;
|>+	int in_kernel = 1;
|>+
|>+	esp = (unsigned long *)regs->esp;
|>+	c = (unsigned char *)regs->eip;
|>+
|>+	if (regs->xcs & 3) {
|>+		in_kernel = 0;
|>+	}
|>+	printk("CPU:    %d\nEIP:    %04x:[<%08lx>]\nEFLAGS: %08lx\n",
|>+		cpu, 0xffff & regs->xcs, regs->eip, regs->eflags);
|>+	printk("eax: %08lx   ebx: %08lx   ecx: %08lx   edx: %08lx\n",
|>+		regs->eax, regs->ebx, regs->ecx, regs->edx);
|>+	printk("esi: %08lx   edi: %08lx   ebp: %08lx   esp: %p\n",
|>+		regs->esi, regs->edi, regs->ebp, esp);
|>+	printk("ds: %04x   es: %04x   ss: %04x\n",
|>+		regs->xds & 0xffff, regs->xes & 0xffff, regs->xss & 0xffff);
|>+	if (!tsk) {
|>+		printk("no stack for this cpu\n");
|>+		return;
|>+	}
|>+	printk("Process %s (pid: %d, stackpage=%08lx)",
|>+		tsk->comm, tsk->pid, 4096+(regs->esp & ~(THREAD_SIZE-1)));
|>+	/*
|>+	 * When in-kernel, we also print out the stack and code at the
|>+	 * time of the fault..
|>+	 */
|>+	if (in_kernel) {
|>+		unsigned long *stack;
|>+		unsigned long addr, module_start, module_end;
|>+		extern char _stext, _etext;
|>+
|>+		extern int kstack_depth_to_print;
|>+
|>+		esp = (unsigned long *)((unsigned long)tsk + (regs->esp & (THREAD_SIZE-1)));
|>+
|>+		printk("\nStack: ");
|>+		stack = esp;
|>+		for(i=0; i < kstack_depth_to_print; i++) {
|>+			if ((unsigned long)stack > (unsigned long)tsk + THREAD_SIZE-1)
|>+				break;
|>+			if (i && ((i % 8) == 0))
|>+				printk("\n       ");
|>+			printk("%08lx ", *stack++);
|>+		}
|>+
|>+		printk("\nCall Trace: ");
|>+		i = 1;
|>+		stack = esp;
|>+		module_start = VMALLOC_START;
|>+		module_end = VMALLOC_END;
|>+		module_end = 0;
|>+		while ((unsigned long)stack < (unsigned long)tsk + THREAD_SIZE) {
|>+			addr = *stack++;
|>+			/*
|>+			 * If the address is either in the text segment of the
|>+			 * kernel, or in the region which contains vmalloc'ed
|>+			 * memory, it *may* be the address of a calling
|>+			 * routine; if so, print it so that someone tracing
|>+			 * down the cause of the crash will be able to figure
|>+			 * out the call path that was taken.
|>+			 */
|>+			if (((addr >= (unsigned long) &_stext) &&
|>+			     (addr <= (unsigned long) &_etext)) ||
|>+			    ((addr >= module_start) && (addr <= module_end))) {
|>+				if (i && ((i % 8) == 0))
|>+					printk("\n       ");
|>+				printk("[<%08lx>] ", addr);
|>+				i++;
|>+			}
|>+		}
|>+		printk("\n");
|>+
|>+		printk("\nCode: ");
|>+		if(regs->eip < PAGE_OFFSET) {
|>+			printk("eip in user space. error.\n");
|>+		}
|>+
|>+		for(i=0;i<20;i++) {
|>+			printk("%02x ", *c++);
|>+		}
|>+	}
|>+	printk("\n");
|>+	return;
|>+}
|>+#endif /* CONFIG_DUMP */
|>+
|> /*
|>  * These constants are for searching for possible module text
|>  * segments.
|>@@ -471,12 +570,33 @@
|> }
|> #endif
|>
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+#ifdef CONFIG_SMP
|>+int (*dump_ipi_function_ptr)(struct pt_regs *) = NULL;
|>+static int dump_ipi(struct pt_regs *regs)
|>+{
|>+	if (!(dump_ipi_function_ptr && dump_ipi_function_ptr(regs))) {
|>+		return 0;
|>+	}
|>+	ack_APIC_irq();
|>+	return 1;
|>+}
|>+#else
|>+#define dump_ipi(regs) 0
|>+#endif
|>+#endif
|>+
|> asmlinkage void do_nmi(struct pt_regs * regs, long error_code)
|> {
|> 	unsigned char reason = inb(0x61);
|>
|>
|> 	++nmi_count(smp_processor_id());
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+	if (dump_ipi(regs)) {
|>+		return;
|>+	}
|>+#endif
|> 	if (!(reason & 0xc0)) {
|> #if CONFIG_X86_IO_APIC
|> 		/*
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/drivers/char/sysrq.c lkcd_cvs_new/2.4/drivers/char/sysrq.c
|>--- lkcd_cvs_orig/2.4/drivers/char/sysrq.c	Fri Nov 23 17:25:29 2001
|>+++ lkcd_cvs_new/2.4/drivers/char/sysrq.c	Tue Nov 27 14:01:47 2001
|>@@ -96,6 +96,15 @@
|> 		dump("sysrq", pt_regs);
|> 		break;
|> #endif
|>+#if defined(CONFIG_DUMP)
|>+	case 'd':
|>+		{
|>+		extern void show_cpu_state(struct pt_regs *);
|>+		printk("Show state of all cpus\n");
|>+		show_cpu_state(pt_regs);
|>+		break;
|>+		}
|>+#endif
|>
|> 	case 'o':					    /* O -- power off */
|> 		if (sysrq_power_off) {
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/drivers/dump/dump_base.c lkcd_cvs_new/2.4/drivers/dump/dump_base.c
|>--- lkcd_cvs_orig/2.4/drivers/dump/dump_base.c	Fri Nov 23 17:25:30 2001
|>+++ lkcd_cvs_new/2.4/drivers/dump/dump_base.c	Mon Nov 26 16:34:44 2001
|>@@ -268,14 +268,12 @@
|> extern struct new_utsname system_utsname;     /* system information        */
|>
|> /* external architecture-specific functions */
|>-extern void __dump_open(struct file *, uint64_t);
|>+extern void __dump_open(void);
|>+extern void __dump_cleanup(void);
|> extern void __dump_init(uint64_t);
|>-extern int __dump_configure_header(dump_header_asm_t *, struct pt_regs *);
|>+extern int __dump_configure_header(struct pt_regs *);
|> extern unsigned int  __dump_silence_system(unsigned int);
|> extern unsigned int  __dump_resume_system(unsigned int);
|>-#ifdef CONFIG_X86
|>-extern void __dump_save_panic_regs(dump_header_asm_t *);
|>-#endif
|>
|> /* external functions                                                      */
|> extern void si_meminfo(struct sysinfo *);
|>@@ -736,7 +734,7 @@
|> 	}
|>
|> 	/* configure architecture-specific dump header values */
|>-	if (!__dump_configure_header(&dump_header_asm, regs)) {
|>+	if (!__dump_configure_header(regs)) {
|> 		return (0);
|> 	}
|> 	return (1);
|>@@ -792,16 +790,11 @@
|>  *       memory pages and dumps the data to disk (using other functions).
|>  */
|> static int
|>-dump_execute_memdump(char *panic_str, struct pt_regs *regs)
|>+dump_execute_memdump(void)
|> {
|> 	int counter = 0, state = 0;
|> 	unsigned long mem_loc, buf_loc;
|>
|>-	if (!dump_configure_header(panic_str, regs)) {
|>-		DUMP_PRINT("Dump header could not be configured!");
|>-		return (-1);
|>-	}
|>-
|> 	DUMP_PRINT("\nDump compression value is 0x%x ...", dump_compress);
|>
|> 	DUMP_PRINT("\nWriting dump header ...");
|>@@ -939,39 +932,23 @@
|> 		return;
|> 	}
|>
|>+	if(!dump_configure_header(panic_str, regs)) {
|>+		DUMP_PRINT("\ndump header could not be configured!");
|>+		return;
|>+	}
|>+
|> 	/* silence the system */
|> 	dump_silence_system();
|>
|> 	/* bail out if we're not going to do any dumping */
|> 	if (dump_level != DUMP_LEVEL_NONE) {
|> 		/* inform users of what we are about to do */
|>-#ifdef CONFIG_SMP
|> 		DUMP_PRINTN("Dumping to device 0x%x [%s] on CPU %d ...",
|> 			dump_device, bdevname(dump_device),
|> 			smp_processor_id());
|>-#else
|>-		DUMP_PRINTN("Dumping to device 0x%x [%s] on CPU %d ...",
|>-			dump_device, bdevname(dump_device),
|>-			0);
|>-#endif
|>
|> 		/* start walking through the page tables */
|>-		state = dump_execute_memdump(panic_str, regs);
|>-
|>-#ifdef CONFIG_X86
|>-		/*
|>-		 * Okay, this is REALLY annoying to have to
|>-		 * do.  What this means is that for x86
|>-		 * systems, we have to literally save the
|>-		 * esp/eip _now_, because we don't want the
|>-		 * esp/eip from dump_write_header() or
|>-		 * anything it calls to conflict with
|>-		 * re-building the panic() stack trace case.
|>-		 * So for that reason, we save the eip/esp
|>-		 * now so we can re-build the trace later.
|>-		 */
|>-		__dump_save_panic_regs(&dump_header_asm);
|>-#endif
|>+		state = dump_execute_memdump();
|>
|> 		/* update header to disk for the last time */
|> 		if (dump_write_header() < 0) {
|>@@ -1054,7 +1031,7 @@
|> 	struct list_head *tmp;
|> 	dump_compress_t *dc;
|>
|>-	/* try to remove the compression item */
|>+	/* try to set the compression type*/
|> 	list_for_each(tmp, &dump_compress_list) {
|> 		dc = list_entry(tmp, dump_compress_t, list);
|> 		if (dc->compress_type == compression_type) {
|>@@ -1210,6 +1187,7 @@
|> 			if (!(f->f_flags & O_RDWR)) {
|> 				return (-EPERM);
|> 			}
|>+			__dump_open();
|> 			return (dump_open_kdev((kdev_t)arg));
|>
|> 		/* get dump_device */
|>@@ -1423,6 +1401,9 @@
|> 	if (dump_page_buf) {
|> 		kfree((const void *)dump_page_buf);
|> 	}
|>+
|>+	/* arch-specific cleanup routine */
|>+	__dump_cleanup();
|>
|> 	/* remove the proc entries */
|> 	dump_proc_cleanup();
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/drivers/dump/dump_i386.c lkcd_cvs_new/2.4/drivers/dump/dump_i386.c
|>--- lkcd_cvs_orig/2.4/drivers/dump/dump_i386.c	Thu Oct  4 14:19:49 2001
|>+++ lkcd_cvs_new/2.4/drivers/dump/dump_i386.c	Tue Nov 27 14:15:26 2001
|>@@ -21,50 +21,143 @@
|> #include <linux/kernel.h>
|> #include <linux/smp.h>
|> #include <linux/fs.h>
|>+#include <linux/vmalloc.h>
|> #include <linux/dump.h>
|> #include <linux/mm.h>
|> #include <asm/processor.h>
|> #include <asm/hardirq.h>
|> #include <linux/irq.h>
|>
|>-extern volatile int dump_in_progress;
|>-extern unsigned long irq_affinity[NR_IRQS];
|> static unsigned long saved_affinity[NR_IRQS];
|>
|>-/*
|>- * Name: __dump_save_panic_regs()
|>- * Func: Save the EIP (really the RA).  We may pass an argument later.
|>- * 	 Save ESP also here.
|>- */
|>-inline void
|>-__dump_save_panic_regs(dump_header_asm_t *dha)
|>-{
|>-	__asm__ __volatile__("movl  %%esp, %0\n"
|>-		: "=r" (dha->dha_esp));
|>-	/* hate to do this, but ... */
|>-#ifdef CONFIG_FRAME_POINTER
|>-	__asm__ __volatile__("movl  4(%%esp), %0\n"
|>-		: "=r" (dha->dha_eip));
|>+static int alloc_dha_stack(void)
|>+{
|>+	int i;
|>+	void *ptr;
|>+
|>+	if (dump_header_asm.dha_stack[0])
|>+		return 0;
|>+
|>+       	ptr = vmalloc(THREAD_SIZE * smp_num_cpus);
|>+	if (!ptr) {
|>+		printk("vmalloc for dha_stacks failed\n");
|>+		return -ENOMEM;
|>+	}
|>+
|>+	for (i = 0; i < smp_num_cpus; i++) {
|>+		dump_header_asm.dha_stack[i] = (void *)((unsigned long)ptr + (i * THREAD_SIZE));
|>+	}
|>+	return 0;
|>+}
|>+
|>+static int free_dha_stack(void)
|>+{
|>+	if (dump_header_asm.dha_stack[0])
|>+		vfree(dump_header_asm.dha_stack[0]);
|>+	return 0;
|>+}
|>+
|>+/* In case of panic dumps, we collects regs on entry to panic.
|>+ * so, we shouldn't 'fix' ssesp here again. But it is hard to
|>+ * tell just looking at regs whether ssesp need fixing. We make
|>+ * this decision by looking at xss in regs. If we have better
|>+ * means to determine that ssesp are valid (by some flag which
|>+ * tells that we are here due to panic dump), then we can use
|>+ * that instead of this kludge.
|>+ */
|>+static inline void
|>+fix_ssesp(struct pt_regs *regs, int cpu)
|>+{
|>+	if (!user_mode(regs)) {
|>+		if ((cpu == dump_header_asm.dha_dumping_cpu) &&
|>+			regs->xss == __KERNEL_DS)
|>+			return;
|>+		dump_header_asm.dha_smp_regs[cpu].esp =
|>+				(unsigned long)&(regs->esp);
|>+		__asm__ __volatile__ ("movw %%ss, %%ax;"
|>+			:"=a"(dump_header_asm.dha_smp_regs[cpu].xss));
|>+	}
|>+}
|>+
|>+static void
|>+save_this_cpu_state(int cpu, struct pt_regs *regs, struct task_struct *tsk)
|>+{
|>+	dump_header_asm.dha_smp_regs[cpu] = *regs;
|>+	dump_header_asm.dha_smp_current_task[cpu] = tsk;
|>+	fix_ssesp(regs, cpu);
|>+
|>+	if (dump_header_asm.dha_stack[cpu]) {
|>+		memcpy(dump_header_asm.dha_stack[cpu], tsk, THREAD_SIZE);
|>+	}
|>+	return;
|>+}
|>+
|>+#ifdef CONFIG_SMP
|>+static int dump_expect_ipi[NR_CPUS];
|>+static atomic_t waiting_for_dump_ipi;
|>+static int wait_for_dump_ipi = 1; /* always wait for ipi to to be handled */
|>+
|>+static int
|>+dump_ipi_handler(struct pt_regs *regs)
|>+{
|>+	int cpu = smp_processor_id();
|>+
|>+	if (!dump_expect_ipi[cpu]) {
|>+		return 0;
|>+	}
|>+
|>+	save_this_cpu_state(cpu, regs, current);
|>+
|>+	dump_expect_ipi[cpu] = 0;
|>+	atomic_dec(&waiting_for_dump_ipi);
|>+	return 1;
|>+}
|>+
|>+/* save registers on other processors */
|>+void
|>+save_other_cpu_states(void)
|>+{
|>+	int i;
|>+
|>+	if (smp_num_cpus > 1) {
|>+		atomic_set(&waiting_for_dump_ipi, smp_num_cpus-1);
|>+		for (i = 0; i < NR_CPUS; i++)
|>+			dump_expect_ipi[i] = 1;
|>+
|>+		dump_ipi_function_ptr = dump_ipi_handler;
|>+		dump_send_ipi();
|>+		/* may be we dont need to wait for NMI to be processed.
|>+		   just write out the header at the end of dumping, if
|>+		   this IPI is not processed untill then, there probably
|>+		   is a problem and we just fail to capture state of
|>+		   other cpus. */
|>+		if (wait_for_dump_ipi) {
|>+			while(atomic_read(&waiting_for_dump_ipi))
|>+				barrier();
|>+			dump_ipi_function_ptr = NULL;
|>+		}
|>+	}
|>+	return;
|>+}
|> #else
|>-	__asm__ __volatile__("movl  (%%esp), %0\n"
|>-		: "=r" (dha->dha_eip));
|>+#define save_other_cpu_states()
|> #endif
|>-}
|>
|> /*
|>  * Name: __dump_configure_header()
|>  * Func: Configure the dump header with all proper values.
|>  */
|> int
|>-__dump_configure_header(dump_header_asm_t *dha, struct pt_regs *regs)
|>+__dump_configure_header(struct pt_regs *regs)
|> {
|>-	/* save the dump specific esp/eip */
|>-	__dump_save_panic_regs(dha);
|>+	int cpu = smp_processor_id();
|>
|>-	/* one final check -- modify if we're in user mode */
|>-	if ((regs) && (!user_mode(regs))) {
|>-		dha->dha_regs.esp = (unsigned long) &(regs->esp);
|>-	}
|>+	dump_header_asm.dha_smp_num_cpus = smp_num_cpus;
|>+	dump_header_asm.dha_dumping_cpu = cpu;
|>+
|>+	save_this_cpu_state(cpu, regs, current);
|>+
|>+	save_other_cpu_states();
|>
|> 	return (1);
|> }
|>@@ -87,13 +180,28 @@
|>  *       case it's necessary in the future.
|>  */
|> void
|>-__dump_open(struct file *dump_file, uint64_t memory_size)
|>+__dump_open(void)
|> {
|>+	alloc_dha_stack();
|> 	/* return */
|> 	return;
|> }
|>
|> /*
|>+ * Name: __dump_cleanup()
|>+ * Func: Free any architecture specific data structures. This is called
|>+ *       when the dump module is being removed.
|>+ */
|>+void
|>+__dump_cleanup(void)
|>+{
|>+	free_dha_stack();
|>+	/* return */
|>+	return;
|>+}
|>+
|>+#ifdef CONFIG_SMP
|>+/*
|>  * Non dumping cpus will spin here. If a cpu is handling an irq when ipi is
|>  * received, we let go of it here while making sure that it hits schedule
|>  * on the way up and make it spin there instead.
|>@@ -108,6 +216,7 @@
|> 	}
|> 	return;
|> }
|>+#endif
|>
|> /*
|>  * Routine to save the old irq affinities and change affinities of all irqs to
|>@@ -179,4 +288,24 @@
|>
|> 	/* return */
|> 	return (0);
|>+}
|>+
|>+/* located in arch/i386/kernel/traps.c */
|>+extern void show_this_cpu_state(int cpu, struct pt_regs * regs, struct task_struct *tsk);
|>+
|>+void
|>+show_cpu_state(struct pt_regs * regs)
|>+{
|>+	int cpu = smp_processor_id();
|>+	int i;
|>+
|>+	__dump_configure_header(regs);
|>+
|>+	printk("__dump_configure_header done from cpu %d\n", cpu);
|>+
|>+	for (i = 0; i < smp_num_cpus; i++) {
|>+		show_this_cpu_state(i, dump_header_asm.dha_smp_regs[i], dump_header_asm.dha_stack[i]);
|>+	}
|>+
|>+	return;
|> }
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/include/asm-i386/dump.h lkcd_cvs_new/2.4/include/asm-i386/dump.h
|>--- lkcd_cvs_orig/2.4/include/asm-i386/dump.h	Wed Sep 26 15:21:38 2001
|>+++ lkcd_cvs_new/2.4/include/asm-i386/dump.h	Mon Nov 26 14:11:49 2001
|>@@ -14,6 +14,7 @@
|>
|> /* necessary header files */
|> #include <asm/ptrace.h>                          /* for pt_regs             */
|>+#include <linux/threads.h>
|>
|> /* definitions */
|> #define DUMP_ASM_MAGIC_NUMBER     0xdeaddeadULL  /* magic number            */
|>@@ -45,6 +46,44 @@
|> 	/* the dump registers */
|> 	struct pt_regs       dha_regs;
|>
|>+	/* smp specific */
|>+	uint32_t	     dha_smp_num_cpus;
|>+	int		     dha_dumping_cpu;
|>+	struct pt_regs	     dha_smp_regs[NR_CPUS];
|>+	void *		     dha_smp_current_task[NR_CPUS];
|>+	void *		     dha_stack[NR_CPUS];
|> } dump_header_asm_t;
|>+
|>+#ifdef __KERNEL__
|>+static inline void get_current_regs(struct pt_regs *regs)
|>+{
|>+	__asm__ __volatile__("movl %%ebx,%0" : "=m"(regs->ebx));
|>+	__asm__ __volatile__("movl %%ecx,%0" : "=m"(regs->ecx));
|>+	__asm__ __volatile__("movl %%edx,%0" : "=m"(regs->edx));
|>+	__asm__ __volatile__("movl %%esi,%0" : "=m"(regs->esi));
|>+	__asm__ __volatile__("movl %%edi,%0" : "=m"(regs->edi));
|>+	__asm__ __volatile__("movl %%ebp,%0" : "=m"(regs->ebp));
|>+	__asm__ __volatile__("movl %%eax,%0" : "=m"(regs->eax));
|>+	__asm__ __volatile__("movl %%esp,%0" : "=m"(regs->esp));
|>+	__asm__ __volatile__("movw %%ss, %%ax;" :"=a"(regs->xss));
|>+	__asm__ __volatile__("movw %%cs, %%ax;" :"=a"(regs->xcs));
|>+	__asm__ __volatile__("movw %%ds, %%ax;" :"=a"(regs->xds));
|>+	__asm__ __volatile__("movw %%es, %%ax;" :"=a"(regs->xes));
|>+	__asm__ __volatile__("pushfl; popl %0" :"=m"(regs->eflags));
|>+	regs->eip = (unsigned long)current_text_addr();
|>+
|>+}
|>+
|>+extern volatile int dump_in_progress;
|>+extern unsigned long irq_affinity[];
|>+extern dump_header_asm_t dump_header_asm;
|>+
|>+#ifdef CONFIG_SMP
|>+extern int (*dump_ipi_function_ptr)(struct pt_regs *);
|>+extern void dump_send_ipi(void);
|>+#else
|>+#define dump_send_ipi()
|>+#endif
|>+#endif /* __KERNEL__ */
|>
|> #endif /* _ASM_DUMP_H */
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/include/asm-i386/hw_irq.h lkcd_cvs_new/2.4/include/asm-i386/hw_irq.h
|>--- lkcd_cvs_orig/2.4/include/asm-i386/hw_irq.h	Thu Jan  1 05:30:00 1970
|>+++ lkcd_cvs_new/2.4/include/asm-i386/hw_irq.h	Mon Nov 26 14:13:23 2001
|>@@ -0,0 +1,226 @@
|>+#ifndef _ASM_HW_IRQ_H
|>+#define _ASM_HW_IRQ_H
|>+
|>+/*
|>+ *	linux/include/asm/hw_irq.h
|>+ *
|>+ *	(C) 1992, 1993 Linus Torvalds, (C) 1997 Ingo Molnar
|>+ *
|>+ *	moved some of the old arch/i386/kernel/irq.h to here. VY
|>+ *
|>+ *	IRQ/IPI changes taken from work by Thomas Radke
|>+ *	<tomsoft@informatik.tu-chemnitz.de>
|>+ */
|>+
|>+#include <linux/config.h>
|>+#include <asm/atomic.h>
|>+#include <asm/irq.h>
|>+
|>+/*
|>+ * IDT vectors usable for external interrupt sources start
|>+ * at 0x20:
|>+ */
|>+#define FIRST_EXTERNAL_VECTOR	0x20
|>+
|>+#define SYSCALL_VECTOR		0x80
|>+
|>+/*
|>+ * Vectors 0x20-0x2f are used for ISA interrupts.
|>+ */
|>+
|>+/*
|>+ * Special IRQ vectors used by the SMP architecture, 0xf0-0xff
|>+ *
|>+ *  some of the following vectors are 'rare', they are merged
|>+ *  into a single vector (CALL_FUNCTION_VECTOR) to save vector space.
|>+ *  TLB, reschedule and local APIC vectors are performance-critical.
|>+ *
|>+ *  Vectors 0xf0-0xfa are free (reserved for future Linux use).
|>+ */
|>+#define SPURIOUS_APIC_VECTOR	0xff
|>+#define ERROR_APIC_VECTOR	0xfe
|>+#define INVALIDATE_TLB_VECTOR	0xfd
|>+#define RESCHEDULE_VECTOR	0xfc
|>+#define CALL_FUNCTION_VECTOR	0xfb
|>+#define DUMP_VECTOR		0xfa
|>+
|>+/*
|>+ * Local APIC timer IRQ vector is on a different priority level,
|>+ * to work around the 'lost local interrupt if more than 2 IRQ
|>+ * sources per level' errata.
|>+ */
|>+#define LOCAL_TIMER_VECTOR	0xef
|>+
|>+/*
|>+ * First APIC vector available to drivers: (vectors 0x30-0xee)
|>+ * we start at 0x31 to spread out vectors evenly between priority
|>+ * levels. (0x80 is the syscall vector)
|>+ */
|>+#define FIRST_DEVICE_VECTOR	0x31
|>+#define FIRST_SYSTEM_VECTOR	0xef
|>+
|>+extern int irq_vector[NR_IRQS];
|>+#define IO_APIC_VECTOR(irq)	irq_vector[irq]
|>+
|>+/*
|>+ * Various low-level irq details needed by irq.c, process.c,
|>+ * time.c, io_apic.c and smp.c
|>+ *
|>+ * Interrupt entry/exit code at both C and assembly level
|>+ */
|>+
|>+extern void mask_irq(unsigned int irq);
|>+extern void unmask_irq(unsigned int irq);
|>+extern void disable_8259A_irq(unsigned int irq);
|>+extern void enable_8259A_irq(unsigned int irq);
|>+extern int i8259A_irq_pending(unsigned int irq);
|>+extern void make_8259A_irq(unsigned int irq);
|>+extern void init_8259A(int aeoi);
|>+extern void FASTCALL(send_IPI_self(int vector));
|>+extern void init_VISWS_APIC_irqs(void);
|>+extern void setup_IO_APIC(void);
|>+extern void disable_IO_APIC(void);
|>+extern void print_IO_APIC(void);
|>+extern int IO_APIC_get_PCI_irq_vector(int bus, int slot, int fn);
|>+extern void send_IPI(int dest, int vector);
|>+
|>+extern unsigned long io_apic_irqs;
|>+
|>+extern atomic_t irq_err_count;
|>+extern atomic_t irq_mis_count;
|>+
|>+extern char _stext, _etext;
|>+
|>+#define IO_APIC_IRQ(x) (((x) >= 16) || ((1<<(x)) & io_apic_irqs))
|>+
|>+#define __STR(x) #x
|>+#define STR(x) __STR(x)
|>+
|>+#define SAVE_ALL \
|>+	"cld\n\t" \
|>+	"pushl %es\n\t" \
|>+	"pushl %ds\n\t" \
|>+	"pushl %eax\n\t" \
|>+	"pushl %ebp\n\t" \
|>+	"pushl %edi\n\t" \
|>+	"pushl %esi\n\t" \
|>+	"pushl %edx\n\t" \
|>+	"pushl %ecx\n\t" \
|>+	"pushl %ebx\n\t" \
|>+	"movl $" STR(__KERNEL_DS) ",%edx\n\t" \
|>+	"movl %edx,%ds\n\t" \
|>+	"movl %edx,%es\n\t"
|>+
|>+#define IRQ_NAME2(nr) nr##_interrupt(void)
|>+#define IRQ_NAME(nr) IRQ_NAME2(IRQ##nr)
|>+
|>+#define GET_CURRENT \
|>+	"movl %esp, %ebx\n\t" \
|>+	"andl $-8192, %ebx\n\t"
|>+
|>+/*
|>+ *	SMP has a few special interrupts for IPI messages
|>+ */
|>+
|>+	/* there is a second layer of macro just to get the symbolic
|>+	   name for the vector evaluated. This change is for RTLinux */
|>+#define BUILD_SMP_INTERRUPT(x,v) XBUILD_SMP_INTERRUPT(x,v)
|>+#define XBUILD_SMP_INTERRUPT(x,v)\
|>+asmlinkage void x(void); \
|>+asmlinkage void call_##x(void); \
|>+__asm__( \
|>+"\n"__ALIGN_STR"\n" \
|>+SYMBOL_NAME_STR(x) ":\n\t" \
|>+	"pushl $"#v"\n\t" \
|>+	SAVE_ALL \
|>+	SYMBOL_NAME_STR(call_##x)":\n\t" \
|>+	"call "SYMBOL_NAME_STR(smp_##x)"\n\t" \
|>+	"jmp ret_from_intr\n");
|>+
|>+#define BUILD_SMP_TIMER_INTERRUPT(x,v) XBUILD_SMP_TIMER_INTERRUPT(x,v)
|>+#define XBUILD_SMP_TIMER_INTERRUPT(x,v) \
|>+asmlinkage void x(struct pt_regs * regs); \
|>+asmlinkage void call_##x(void); \
|>+__asm__( \
|>+"\n"__ALIGN_STR"\n" \
|>+SYMBOL_NAME_STR(x) ":\n\t" \
|>+	"pushl $"#v"\n\t" \
|>+	SAVE_ALL \
|>+	"movl %esp,%eax\n\t" \
|>+	"pushl %eax\n\t" \
|>+	SYMBOL_NAME_STR(call_##x)":\n\t" \
|>+	"call "SYMBOL_NAME_STR(smp_##x)"\n\t" \
|>+	"addl $4,%esp\n\t" \
|>+	"jmp ret_from_intr\n");
|>+
|>+#define BUILD_COMMON_IRQ() \
|>+asmlinkage void call_do_IRQ(void); \
|>+__asm__( \
|>+	"\n" __ALIGN_STR"\n" \
|>+	"common_interrupt:\n\t" \
|>+	SAVE_ALL \
|>+	"pushl $ret_from_intr\n\t" \
|>+	SYMBOL_NAME_STR(call_do_IRQ)":\n\t" \
|>+	"jmp "SYMBOL_NAME_STR(do_IRQ));
|>+
|>+/*
|>+ * subtle. orig_eax is used by the signal code to distinct between
|>+ * system calls and interrupted 'random user-space'. Thus we have
|>+ * to put a negative value into orig_eax here. (the problem is that
|>+ * both system calls and IRQs want to have small integer numbers in
|>+ * orig_eax, and the syscall code has won the optimization conflict ;)
|>+ *
|>+ * Subtle as a pigs ear.  VY
|>+ */
|>+
|>+#define BUILD_IRQ(nr) \
|>+asmlinkage void IRQ_NAME(nr); \
|>+__asm__( \
|>+"\n"__ALIGN_STR"\n" \
|>+SYMBOL_NAME_STR(IRQ) #nr "_interrupt:\n\t" \
|>+	"pushl $"#nr"-256\n\t" \
|>+	"jmp common_interrupt");
|>+
|>+extern unsigned long prof_cpu_mask;
|>+extern unsigned int * prof_buffer;
|>+extern unsigned long prof_len;
|>+extern unsigned long prof_shift;
|>+
|>+/*
|>+ * x86 profiling function, SMP safe. We might want to do this in
|>+ * assembly totally?
|>+ */
|>+static inline void x86_do_profile (unsigned long eip)
|>+{
|>+	if (!prof_buffer)
|>+		return;
|>+
|>+	/*
|>+	 * Only measure the CPUs specified by /proc/irq/prof_cpu_mask.
|>+	 * (default is all CPUs.)
|>+	 */
|>+	if (!((1<<smp_processor_id()) & prof_cpu_mask))
|>+		return;
|>+
|>+	eip -= (unsigned long) &_stext;
|>+	eip >>= prof_shift;
|>+	/*
|>+	 * Don't ignore out-of-bounds EIP values silently,
|>+	 * put them into the last histogram slot, so if
|>+	 * present, they will show up as a sharp peak.
|>+	 */
|>+	if (eip > prof_len-1)
|>+		eip = prof_len-1;
|>+	atomic_inc((atomic_t *)&prof_buffer[eip]);
|>+}
|>+
|>+#ifdef CONFIG_SMP /*more of this file should probably be ifdefed SMP */
|>+static inline void hw_resend_irq(struct hw_interrupt_type *h, unsigned int i) {
|>+	if (IO_APIC_IRQ(i))
|>+		send_IPI_self(IO_APIC_VECTOR(i));
|>+}
|>+#else
|>+static inline void hw_resend_irq(struct hw_interrupt_type *h, unsigned int i) {}
|>+#endif
|>+
|>+#endif /* _ASM_HW_IRQ_H */
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/kernel/panic.c lkcd_cvs_new/2.4/kernel/panic.c
|>--- lkcd_cvs_orig/2.4/kernel/panic.c	Tue Oct 16 12:51:46 2001
|>+++ lkcd_cvs_new/2.4/kernel/panic.c	Mon Nov 26 17:33:33 2001
|>@@ -56,6 +56,10 @@
|> #if defined(CONFIG_ARCH_S390)
|>         unsigned long caller = (unsigned long) __builtin_return_address(0);
|> #endif
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+	struct pt_regs regs;
|>+	get_current_regs(&regs);
|>+#endif
|>
|> 	va_start(args, fmt);
|> 	vsprintf(buf, fmt, args);
|>@@ -78,7 +82,9 @@
|>
|> 	notifier_call_chain(&panic_notifier_list, 0, NULL);
|>
|>-	dump(buf, NULL);
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+	dump(buf, &regs);
|>+#endif
|>
|> 	if (panic_timeout > 0)
|> 	{
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/Makefile lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/Makefile
|>--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/Makefile	Fri Jan 26 02:42:01 2001
|>+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/Makefile	Tue Nov 27 13:08:26 2001
|>@@ -8,7 +8,7 @@
|> include $(DEPTH)/commondefs
|>
|> TARGETS   = $(DEPTH)/libarch.a
|>-CFILES    = i386_cmds.c cmd_mktrace.c
|>+CFILES    = i386_cmds.c cmd_mktrace.c cmd_rd.c cmd_defcpu.c
|> OFILES    = $(CFILES:.c=.o)
|>
|> all: default
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c
|>--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c	Thu Jan  1 05:30:00 1970
|>+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c	Tue Nov 27 13:30:01 2001
|>@@ -0,0 +1,88 @@
|>+#include <lcrash.h>
|>+
|>+extern int get_dump_header_asm(dump_header_asm_t *);
|>+
|>+int defcpu = -1;
|>+
|>+/*
|>+ * deftask_cmd() -- Run the 'deftask' command.
|>+ */
|>+int
|>+defcpu_cmd(command_t *cmd)
|>+{
|>+	dump_header_asm_t dha;
|>+	int cpu;
|>+
|>+	if (cmd->nargs == 0) {
|>+		if (defcpu == -1) {
|>+			fprintf(cmd->efp, "No default cpu set\n");
|>+		} else {
|>+			fprintf(cmd->ofp, "Default cpu is %d\n", defcpu);
|>+		}
|>+		return(0);
|>+	}
|>+
|>+	if (MIP->core_type != reg_core) {
|>+		fprintf(cmd->efp, "Can't use this command on live system\n");
|>+		return (1);
|>+	}
|>+	if (get_dump_header_asm(&dha))
|>+		return (1);
|>+
|>+	cpu = strtol(cmd->args[0], NULL, 10);
|>+
|>+	if (cpu >= dha.dha_smp_num_cpus) {
|>+		fprintf(cmd->efp, "Error setting defcpu to %s\n", cmd->args[0]);
|>+		return (1);
|>+	}
|>+	defcpu = cpu;
|>+	fprintf(cmd->ofp, "Default cpu is %d\n", defcpu);
|>+
|>+	if (dha.dha_stack[defcpu]) {
|>+		deftask = (kaddr_t)dha.dha_smp_current_task[defcpu];
|>+		fprintf(cmd->ofp, "Default task is 0x%x\n", deftask);
|>+	}
|>+	return (0);
|>+}
|>+
|>+#define _DEFCPU_USAGE	"[-w outfile] [cpu]"
|>+
|>+/*
|>+ * defcpu_usage() -- Print the usage string for the 'defcpu' command.
|>+ */
|>+void
|>+defcpu_usage(command_t *cmd)
|>+{
|>+	CMD_USAGE(cmd, _DEFCPU_USAGE);
|>+}
|>+
|>+/*
|>+ * defcpu_help() -- Print the help information for the 'defcpu' command.
|>+ */
|>+void
|>+defcpu_help(command_t *cmd)
|>+{
|>+	CMD_HELP(cmd, _DEFCPU_USAGE,
|>+	"Set the default cpu if one is indicated. Otherwise print the "
|>+	"value of default cpu."
|>+        "When 'lcrash' is run on a live system, defcpu has no "
|>+        "meaning.\n\n"
|>+	"This command also sets the default task to the task running "
|>+	"on the default cpu at the time the dump is taken. "
|>+	"The rd command will display the registers on the default cpu "
|>+	"at the time the dump is taken. "
|>+        "The trace command will display a trace wrt the task "
|>+        "running on the default cpu at the time the dump is taken. ");
|>+}
|>+
|>+/*
|>+ * defcpu_parse() -- Parse the command line arguments for 'defcpu'.
|>+ */
|>+int
|>+defcpu_parse(command_t *cmd)
|>+{
|>+	if (set_cmd_flags(cmd, (C_WRITE), 0)) {
|>+		return(1);
|>+	}
|>+	return(0);
|>+}
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c
|>--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c	Thu Jan  1 05:30:00 1970
|>+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c	Mon Nov 26 16:43:31 2001
|>@@ -0,0 +1,65 @@
|>+#include <lcrash.h>
|>+
|>+extern int get_dump_header_asm(dump_header_asm_t *dump_header_asm);
|>+extern int defcpu;
|>+
|>+#define _RD_USAGE "[-w outfile]"
|>+
|>+void
|>+rd_usage(command_t *cmd)
|>+{
|>+	CMD_USAGE(cmd, _RD_USAGE);
|>+}
|>+
|>+void
|>+rd_help(command_t *cmd)
|>+{
|>+	CMD_HELP(cmd, _RD_USAGE,
|>+			"Display the register contents of the default cpu."
|>+			"This command can't be used on a live system ");
|>+}
|>+
|>+int
|>+rd_parse(command_t *cmd)
|>+{
|>+	if (set_cmd_flags(cmd, (C_WRITE), 0)) {
|>+		return(1);
|>+	}
|>+	return 0;
|>+}
|>+
|>+int
|>+rd_cmd(command_t *cmd)
|>+{
|>+	dump_header_asm_t dha;
|>+	struct pt_regs * regs;
|>+
|>+	if (cmd->nargs != 0) {
|>+		rd_usage(cmd);
|>+		return(1);
|>+	}
|>+
|>+	if (MIP->core_type != reg_core) {
|>+		fprintf(cmd->efp, "Can't use this command on live system\n");
|>+		return(1);
|>+	}
|>+
|>+	if (get_dump_header_asm(&dha))
|>+		return(1);
|>+
|>+	if (defcpu == -1)
|>+		defcpu = dha.dha_dumping_cpu;
|>+
|>+	regs = &dha.dha_smp_regs[defcpu];
|>+
|>+	fprintf(cmd->ofp, "CPU:    %d   EIP:    %04x:[<%08lx>]\nEFLAGS: %08lx\n",
|>+		defcpu, regs->xcs & 0xffff, regs->eip, regs->eflags);
|>+	fprintf(cmd->ofp, "eax: %08lx   ebx: %08lx   ecx: %08lx   edx: %08lx\n",
|>+		regs->eax, regs->ebx, regs->ecx, regs->edx);
|>+	fprintf(cmd->ofp, "esi: %08lx   edi: %08lx   ebp: %08lx   esp: %08lx\n",
|>+		regs->esi, regs->edi, regs->ebp, regs->esp);
|>+	fprintf(cmd->ofp, "ds: %04x   es: %04x   ss: %04x\n",
|>+		regs->xds & 0xffff, regs->xes & 0xffff, regs->xss & 0xffff);
|>+
|>+	return(0);
|>+}
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c
|>--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c	Fri Nov 17 05:06:51 2000
|>+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c	Tue Nov 27 13:07:45 2001
|>@@ -6,8 +6,16 @@
|> extern int mktrace_cmd(command_t *), mktrace_parse(command_t *);
|> extern void mktrace_help(command_t *), mktrace_usage(command_t *);
|>
|>+extern int rd_cmd(command_t *), rd_parse(command_t *);
|>+extern void rd_help(command_t *), rd_usage(command_t *);
|>+
|>+extern int defcpu_cmd(command_t *), defcpu_parse(command_t *);
|>+extern void defcpu_help(command_t *), defcpu_usage(command_t *);
|>+
|> _command_t i386_cmdset[] = {
|> 	{"mktrace", 0, mktrace_cmd, mktrace_parse, mktrace_help, mktrace_usage},
|> 	{"mt", "mktrace" },
|>+	{"rd", 0, rd_cmd, rd_parse, rd_help, rd_usage},
|>+	{"defcpu", 0, defcpu_cmd, defcpu_parse, defcpu_help, defcpu_usage},
|> 	{(char *)0 }
|> };
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/lib/trace.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/lib/trace.c
|>--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/lib/trace.c	Tue Jul  3 19:37:36 2001
|>+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/lib/trace.c	Mon Nov 26 13:22:32 2001
|>@@ -741,9 +741,9 @@
|> 		return(1);
|> 	} else {
|> 		saddr = kl_kernelstack(task);
|>-		if (task == kl_dumptask()) {
|>-			eip = kl_dumpeip();
|>-			esp = kl_dumpesp();
|>+		if (kl_smp_dumptask(task)) {
|>+			eip = kl_dumpeip(task);
|>+			esp = kl_dumpesp(task);
|> 		} else {
|> 			if (LINUX_2_2_X(KL_LINUX_RELEASE)) {
|> 				eip = KL_UINT(K_PTR(tsp, "task_struct", "tss"),
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/arch/i386/kl_dump.c lkcd_cvs_new/lkcdutils/libklib/arch/i386/kl_dump.c
|>--- lkcd_cvs_orig/lkcdutils/libklib/arch/i386/kl_dump.c	Thu Oct 12 02:32:54 2000
|>+++ lkcd_cvs_new/lkcdutils/libklib/arch/i386/kl_dump.c	Mon Nov 26 13:11:08 2001
|>@@ -9,7 +9,7 @@
|> /*
|>  * get_dump_header()
|>  */
|>-static int
|>+int
|> get_dump_header(dump_header_t *dump_header)
|> {
|> 	/* first, make sure this isn't a live system
|>@@ -42,7 +42,7 @@
|> /*
|>  * get_dump_header_asm()
|>  */
|>-static int
|>+int
|> get_dump_header_asm(dump_header_asm_t *dump_header_asm)
|> {
|> 	dump_header_t dump_header;
|>@@ -90,36 +90,40 @@
|>  * kl_dumpesp()
|>  */
|> kaddr_t
|>-kl_dumpesp(void)
|>+kl_dumpesp(kaddr_t tsk)
|> {
|>-	dump_header_asm_t dump_header_asm;
|>+	dump_header_asm_t dha;
|>+	int i;
|>
|>-	if (get_dump_header_asm(&dump_header_asm)) {
|>+	if (get_dump_header_asm(&dha)) {
|> 		return((kaddr_t)NULL);
|> 	}
|>-	if (dump_header_asm.dha_regs.esp) {
|>-		return((kaddr_t)dump_header_asm.dha_regs.esp);
|>-	} else {
|>-		return((kaddr_t)dump_header_asm.dha_esp);
|>+
|>+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
|>+		if (tsk == dha.dha_smp_current_task[i])
|>+			return (dha.dha_smp_regs[i].esp);
|> 	}
|>+	return((kaddr_t)NULL);
|> }
|>
|> /*
|>  * kl_dumpeip()
|>  */
|> kaddr_t
|>-kl_dumpeip(void)
|>+kl_dumpeip(kaddr_t tsk)
|> {
|>-	dump_header_asm_t dump_header_asm;
|>+	dump_header_asm_t dha;
|>+	int i;
|>
|>-	if (get_dump_header_asm(&dump_header_asm)) {
|>+	if (get_dump_header_asm(&dha)) {
|> 		return((kaddr_t)NULL);
|> 	}
|>-	if (dump_header_asm.dha_regs.eip) {
|>-		return((kaddr_t)dump_header_asm.dha_regs.eip);
|>-	} else {
|>-		return((kaddr_t)dump_header_asm.dha_eip);
|>+
|>+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
|>+		if (tsk == dha.dha_smp_current_task[i])
|>+			return (dha.dha_smp_regs[i].eip);
|> 	}
|>+	return((kaddr_t)NULL);
|> }
|>
|> /*
|>@@ -134,5 +138,23 @@
|> 		return((kaddr_t)NULL);
|> 	}
|> 	return((kaddr_t)dump_header.dh_current_task);
|>+
|>+}
|>+
|>+int
|>+kl_smp_dumptask(kaddr_t tsk)
|>+{
|>+	dump_header_asm_t dha;
|>+	int i;
|>+
|>+	if (get_dump_header_asm(&dha)) {
|>+		return((kaddr_t)NULL);
|>+	}
|>+
|>+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
|>+		if (dha.dha_smp_regs[i].eip > KL_PAGE_OFFSET && tsk == dha.dha_smp_current_task[i])
|>+			return (1);
|>+	}
|>+	return (0);
|> }
|>
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/include/asm-i386/dump.h lkcd_cvs_new/lkcdutils/libklib/include/asm-i386/dump.h
|>--- lkcd_cvs_orig/lkcdutils/libklib/include/asm-i386/dump.h	Wed Sep  5 13:38:00 2001
|>+++ lkcd_cvs_new/lkcdutils/libklib/include/asm-i386/dump.h	Mon Nov 26 15:41:09 2001
|>@@ -4,7 +4,8 @@
|>  * Created by: Matt Robinson (yakker@sgi.com)
|>  *
|>  * Copyright 1999 Silicon Graphics, Inc. All rights reserved.
|>- *
|>+ *
|>+ * This code is released under version 2 of the GNU GPL.
|>  */
|>
|> /* This header file holds the architecture specific crash dump header */
|>@@ -13,6 +14,7 @@
|>
|> /* necessary header files */
|> #include <asm/ptrace.h>                          /* for pt_regs             */
|>+#include <linux/threads.h>
|>
|> /* definitions */
|> #define DUMP_ASM_MAGIC_NUMBER     0xdeaddeadULL  /* magic number            */
|>@@ -44,17 +46,44 @@
|> 	/* the dump registers */
|> 	struct pt_regs       dha_regs;
|>
|>+	/* smp specific */
|>+	uint32_t	     dha_smp_num_cpus;
|>+	int		     dha_dumping_cpu;
|>+	struct pt_regs	     dha_smp_regs[NR_CPUS];
|>+	void *		     dha_smp_current_task[NR_CPUS];
|>+	void *		     dha_stack[NR_CPUS];
|> } dump_header_asm_t;
|>
|> #ifdef __KERNEL__
|>-extern void __dump_open(struct file *, uint64_t);
|>-extern void __dump_init(uint64_t);
|>-extern void __dump_silence_system(void);
|>-extern void __dump_resume_system(void);
|>-extern int __dump_configure_header(dump_header_asm_t *, struct pt_regs *);
|>-#ifdef CONFIG_X86
|>-extern void __dump_save_panic_regs(dump_header_asm_t *);
|>-#endif
|>+static inline void get_current_regs(struct pt_regs *regs)
|>+{
|>+	__asm__ __volatile__("movl %%ebx,%0" : "=m"(regs->ebx));
|>+	__asm__ __volatile__("movl %%ecx,%0" : "=m"(regs->ecx));
|>+	__asm__ __volatile__("movl %%edx,%0" : "=m"(regs->edx));
|>+	__asm__ __volatile__("movl %%esi,%0" : "=m"(regs->esi));
|>+	__asm__ __volatile__("movl %%edi,%0" : "=m"(regs->edi));
|>+	__asm__ __volatile__("movl %%ebp,%0" : "=m"(regs->ebp));
|>+	__asm__ __volatile__("movl %%eax,%0" : "=m"(regs->eax));
|>+	__asm__ __volatile__("movl %%esp,%0" : "=m"(regs->esp));
|>+	__asm__ __volatile__("movw %%ss, %%ax;" :"=a"(regs->xss));
|>+	__asm__ __volatile__("movw %%cs, %%ax;" :"=a"(regs->xcs));
|>+	__asm__ __volatile__("movw %%ds, %%ax;" :"=a"(regs->xds));
|>+	__asm__ __volatile__("movw %%es, %%ax;" :"=a"(regs->xes));
|>+	__asm__ __volatile__("pushfl; popl %0" :"=m"(regs->eflags));
|>+	regs->eip = (unsigned long)current_text_addr();
|>+
|>+}
|>+
|>+extern volatile int dump_in_progress;
|>+extern unsigned long irq_affinity[];
|>+extern dump_header_asm_t dump_header_asm;
|>+
|>+#ifdef CONFIG_SMP
|>+extern int (*dump_ipi_function_ptr)(struct pt_regs *);
|>+extern void dump_send_ipi(void);
|>+#else
|>+#define dump_send_ipi()
|> #endif
|>+#endif /* __KERNEL__ */
|>
|> #endif /* _ASM_DUMP_H */
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/include/kl_task.h lkcd_cvs_new/lkcdutils/libklib/include/kl_task.h
|>--- lkcd_cvs_orig/lkcdutils/libklib/include/kl_task.h	Thu Oct 12 02:32:54 2000
|>+++ lkcd_cvs_new/lkcdutils/libklib/include/kl_task.h	Mon Nov 26 13:17:16 2001
|>@@ -9,7 +9,8 @@
|> int kl_parent_pid(void *);
|> kaddr_t kl_pid_to_task(kaddr_t);
|> k_error_t kl_get_task_struct(kaddr_t, int, void *);
|>-kaddr_t kl_dumpeip(void);
|>-kaddr_t kl_dumpesp(void);
|>+kaddr_t kl_dumpeip(kaddr_t tsk);
|>+kaddr_t kl_dumpesp(kaddr_t tsk);
|>+int kl_smp_dumptask(kaddr_t tsk);
|> kaddr_t kl_dumptask(void);
|> kaddr_t kl_kernelstack(kaddr_t);
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/kl_memory.c lkcd_cvs_new/lkcdutils/libklib/kl_memory.c
|>--- lkcd_cvs_orig/lkcdutils/libklib/kl_memory.c	Fri Nov 23 17:25:35 2001
|>+++ lkcd_cvs_new/lkcdutils/libklib/kl_memory.c	Mon Nov 26 13:15:58 2001
|>@@ -123,6 +123,34 @@
|> 	return((meminfo_t *)NULL);
|> }
|>
|>+extern int get_dump_header_asm(dump_header_asm_t *dha);
|>+kaddr_t
|>+__kl_fix_vaddr(kaddr_t vaddr, size_t sz)
|>+{
|>+	dump_header_asm_t dha;
|>+	kaddr_t cur_task;
|>+	int i;
|>+
|>+	if (MIP->core_type != reg_core) {
|>+		return vaddr;
|>+	}
|>+	if (get_dump_header_asm(&dha))
|>+		return vaddr;
|>+
|>+	/* this is a very simplistic check to see if we have saved
|>+	 * (snapshotted) this particular block. This is very limited
|>+	 * to finding the saved task structs only.
|>+	 */
|>+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
|>+		if (dha.dha_smp_regs[i].eip < KL_PAGE_OFFSET)
|>+			continue; /* if task is in user space, no need to look at saved stack */
|>+		cur_task = dha.dha_smp_current_task[i];
|>+		if (vaddr >= cur_task && vaddr + sz <  cur_task + KSTACK_SIZE)
|>+			return (dha.dha_stack[i] + (vaddr - cur_task));
|>+	}
|>+	return vaddr;
|>+}
|>+
|> /*
|>  * get_block()
|>  *
|>@@ -142,13 +170,16 @@
|> 		KL_ERROR = KLE_ZERO_SIZE;
|> 	} else {
|> 		while (size > 0){
|>+			kaddr_t tmp = vaddr;
|> 			s=((vaddr & KL_PAGE_MASK) | (~KL_PAGE_MASK)) -
|> 				vaddr + 1;
|> 			s= (size > s) ? s : size;
|>+			vaddr = __kl_fix_vaddr(vaddr, s);
|> 			if ( kl_virtop(vaddr, mmap, &paddr) ) {
|> 				return(KL_ERROR);
|> 			}
|> 			kl_readmem(paddr, s, bp);
|>+			vaddr = tmp;
|> 			size=size - s;
|> 			vaddr=vaddr + s;
|> 			bp=bp + s;
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lkcd_config/lkcd_config.c lkcd_cvs_new/lkcdutils/lkcd_config/lkcd_config.c
|>--- lkcd_cvs_orig/lkcdutils/lkcd_config/lkcd_config.c	Fri Nov 23 17:25:37 2001
|>+++ lkcd_cvs_new/lkcdutils/lkcd_config/lkcd_config.c	Mon Nov 26 16:35:23 2001
|>@@ -242,7 +242,7 @@
|>
|> 	/* set dump compression */
|> 	if (compress_set == DUMP_TRUE) {
|>-		if ((err = ioctl(dfd, DIOSDUMPCOMPRESS, (caddr_t)&compress)) < 0) {
|>+		if ((err = ioctl(dfd, DIOSDUMPCOMPRESS, (caddr_t)compress)) < 0) {
|> 			perror("ioctl() for dump compression failed");
|> 			close(dfd);
|> 			return (err);
|>@@ -251,7 +251,7 @@
|>
|> 	/* set dump flags */
|> 	if (flags_set == DUMP_TRUE) {
|>-		if ((err = ioctl(dfd, DIOSDUMPFLAGS, (caddr_t)&flags)) < 0) {
|>+		if ((err = ioctl(dfd, DIOSDUMPFLAGS, (caddr_t)flags)) < 0) {
|> 			perror("ioctl() for dump flags failed");
|> 			close(dfd);
|> 			return (err);
|>@@ -260,7 +260,7 @@
|>
|> 	/* set dump level */
|> 	if (level_set == DUMP_TRUE) {
|>-		if ((err = ioctl(dfd, DIOSDUMPLEVEL, (caddr_t)&level)) < 0) {
|>+		if ((err = ioctl(dfd, DIOSDUMPLEVEL, (caddr_t)level)) < 0) {
|> 			perror("ioctl() for dump level failed");
|> 			close(dfd);
|> 			return (err);
|>
|>_______________________________________________
|>Lkcd-general mailing list
|>Lkcd-general@lists.sourceforge.net
|>https://lists.sourceforge.net/lists/listinfo/lkcd-general
|>



From yakker@aparity.com Thu Nov 29 00:37:26 2001
Received: from w032.z064001165.sjc-ca.dsl.cnc.net ([64.1.165.32] helo=nakedeye.aparity.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 169Mh6-0001tn-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 29 Nov 2001 00:37:20 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fAT8cZJ29493;
	Thu, 29 Nov 2001 00:38:35 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: "Vamsi Krishna S ." <vamsi@in.ibm.com>
cc: <lkcd@oss.sgi.com>, <lkcd-general@lists.sourceforge.net>,
   bharata <bharata@in.ibm.com>, suparna <bsuparna@in.ibm.com>,
   subodh <subodh@in.ibm.com>
Subject: Re: [lkcd-general] [PATCH]capturing registers/stack on all processors
In-Reply-To: <20011127143019.A8322@in.ibm.com>
Message-ID: <Pine.LNX.4.30.0111290020130.29286-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov 29 00:38:03 2001
X-Original-Date: Thu, 29 Nov 2001 00:38:35 -0800 (PST)

Looks good, Vamsi.  A couple of points:

- You'll need to fix the cmds.c file to deal with the NULL argument
  at the end of the new commands (or it will conflict with Naomi-san's
  latest code)
- Why hw_irq.h?  I didn't see any CONFIG_DUMP stuff in there ...
- I think you need to add the printk("Dump "); for the default case and
  add CONFIG_DUMP_MODULE for the 'd' command in sysrq.c (unless there's
  a reason not to have it?

My only other concern, which isn't that big, is that someone will
complain if we try to add show_this_cpu_state() into 2.5, as it
is mostly duplicate code.

Great job, Vamsi.  Are all of you on #lkcd?

--Matt

On Tue, 27 Nov 2001, Vamsi Krishna S . wrote:
|>Hello,
|>
|>Here is a patch against lkcd cvs (as on 11/26/2001) for capturing
|>registers on all processors at the time of dumping.
|>
|>This has been found to be crucial to debug problems where some of
|>the cpus on an SMP are hung (executing a tight loop, interrupts
|>disabled).
|>
|>We send an NMI-class IPI to other cpus to capture the registers
|>and stack. This is the only guaranteed way to ensure that other
|>cpus respond. If they don't respond to NMI, there is absolutely
|>nothing we can do in software.
|>
|>We need to capture the stack, even though we would prefer not to.
|>The reason being that the stack could change between the time the
|>registers are captured and the time that page is written out in
|>the dumping process. The chages in the stack could be so
|>significant as to render backtracing impossible/totally inaccurate.
|>
|>Currently, all the changes we made are specific to i386, even
|>though many of the changes could have been arch-independent.
|>
|>Brief list of chages:
|>
|>kernel:
|>- extensions to dump_header_asm_t to add fields to capture:
|>	- smp_num_cpus and dumping_cpu
|>	- registers of all processors
|>	- pointers to current tasks
|>	- pointers to the location where stacks are saved
|>- remove __dump_save_panic_regs
|>- collect registers in panic()
|>- remove all use of dha_esp, dha_eip, dha_regs and use
|>  dha_smp_regs consistantly
|>- cleanup dump_configure_header handling, ie, do it only
|>  once in dump_execute
|>- send NMI to all processors and capture their registers,
|>  current task and kernel stack as part of
|>  __configure_dump_header
|>- [bonus] new magic sysrq key 'd' to show the registers
|>  and, backtrace if inside kernel, on all processors
|>- [side effect] as part of capturing registers on panic
|>  we now seem to be able to backtrace correctly in
|>  panic dump cases.
|>
|>lcrash:
|>- new commands
|>	- rd
|>	- defcpu
|>- rd to display registers captured at the time of taking the
|>  dump on the processor which is currently the defcpu
|>- defcpu to set the default cpu and set deftask to the current
|>  task on that cpu at the time of dump
|>- new kl_smp_dumptask to determine while backtracing if this
|>  task is a current task on any of the processors at the time
|>  of dump
|>- changes to kl_dumpesp/kl_dumpeip to get the esp/eip values
|>  from dha_smp_regs.
|>- changes to get_block() to look at the saved stack if this
|>  task is a current task on any of the processors and was
|>  inside the kernel when the dump was taken
|>- [unrelated bug fix] fix lkcd_config.c to pass the values of
|>  dump level, dump flags and compression_type instead of their
|>  addresses to the ioctl call to set them.
|>
|>
|>--
|>LKCD Team India
|>Linux Technology Center,
|>IBM Software Lab, Bangalore.
|>Ph: +91 80 5044959
|>Internet: vamsi@in.ibm.com
|>
|>--
|>
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/arch/i386/kernel/i386_ksyms.c lkcd_cvs_new/2.4/arch/i386/kernel/i386_ksyms.c
|>--- lkcd_cvs_orig/2.4/arch/i386/kernel/i386_ksyms.c	Mon Sep 24 15:31:42 2001
|>+++ lkcd_cvs_new/2.4/arch/i386/kernel/i386_ksyms.c	Mon Nov 26 14:03:33 2001
|>@@ -31,8 +31,6 @@
|>
|> extern void dump_thread(struct pt_regs *, struct user *);
|> extern spinlock_t rtc_lock;
|>-extern irq_desc_t irq_desc[];
|>-extern unsigned long irq_affinity[];
|>
|> #if defined(CONFIG_APM) || defined(CONFIG_APM_MODULE)
|> extern void machine_real_restart(unsigned char *, int);
|>@@ -150,8 +148,6 @@
|> #endif
|>
|> EXPORT_SYMBOL(get_wchan);
|>-EXPORT_SYMBOL(irq_affinity);
|>-EXPORT_SYMBOL(irq_desc);
|>
|> EXPORT_SYMBOL(rtc_lock);
|>
|>@@ -164,4 +160,17 @@
|>
|> #ifdef CONFIG_X86_PAE
|> EXPORT_SYMBOL(empty_zero_page);
|>+#endif
|>+
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+extern irq_desc_t irq_desc[];
|>+extern unsigned long irq_affinity[];
|>+EXPORT_SYMBOL(irq_affinity);
|>+EXPORT_SYMBOL(irq_desc);
|>+#ifdef CONFIG_SMP
|>+extern void dump_send_ipi(void);
|>+EXPORT_SYMBOL(dump_send_ipi);
|>+extern int (*dump_ipi_function_ptr)(struct pt_regs *);
|>+EXPORT_SYMBOL(dump_ipi_function_ptr);
|>+#endif
|> #endif
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/arch/i386/kernel/smp.c lkcd_cvs_new/2.4/arch/i386/kernel/smp.c
|>--- lkcd_cvs_orig/2.4/arch/i386/kernel/smp.c	Tue Oct 16 12:51:44 2001
|>+++ lkcd_cvs_new/2.4/arch/i386/kernel/smp.c	Mon Nov 26 14:06:02 2001
|>@@ -142,6 +142,15 @@
|> 	 */
|> 	cfg = __prepare_ICR(shortcut, vector);
|>
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+	if (vector == DUMP_VECTOR) {
|>+		/*
|>+		 * Setup DUMP IPI to be delivered as an NMI
|>+		 */
|>+		cfg = (cfg&~APIC_VECTOR_MASK)|APIC_DM_NMI;
|>+	}
|>+#endif	/* CONFIG_DUMP */
|>+
|> 	/*
|> 	 * Send the IPI. The write to APIC_ICR fires this off.
|> 	 */
|>@@ -424,6 +433,13 @@
|>
|> 	do_flush_tlb_all_local();
|> }
|>+
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+void dump_send_ipi(void)
|>+{
|>+	send_IPI_allbutself(DUMP_VECTOR);
|>+}
|>+#endif
|>
|> /*
|>  * this function sends a 'reschedule' IPI to another CPU.
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/arch/i386/kernel/traps.c lkcd_cvs_new/2.4/arch/i386/kernel/traps.c
|>--- lkcd_cvs_orig/2.4/arch/i386/kernel/traps.c	Wed Sep 26 15:16:15 2001
|>+++ lkcd_cvs_new/2.4/arch/i386/kernel/traps.c	Mon Nov 26 16:46:48 2001
|>@@ -89,6 +89,105 @@
|>
|> int kstack_depth_to_print = 24;
|>
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+/*
|>+ * This code mimics show_trace() etc in arch/i386/kernel/traps.c. We don't
|>+ * use them directly as they depend on 8K aligned kernel stacks that our
|>+ * saved stacks don't satisfy. However, there is move to relax the requirement
|>+ * on task_struct to be 8K-aligned. Once that happens, we could simpify this
|>+ * function.
|>+ */
|>+void show_this_cpu_state(int cpu, struct pt_regs * regs, struct task_struct *tsk)
|>+{
|>+	int i;
|>+	unsigned long *esp;
|>+	unsigned char *c;
|>+	int in_kernel = 1;
|>+
|>+	esp = (unsigned long *)regs->esp;
|>+	c = (unsigned char *)regs->eip;
|>+
|>+	if (regs->xcs & 3) {
|>+		in_kernel = 0;
|>+	}
|>+	printk("CPU:    %d\nEIP:    %04x:[<%08lx>]\nEFLAGS: %08lx\n",
|>+		cpu, 0xffff & regs->xcs, regs->eip, regs->eflags);
|>+	printk("eax: %08lx   ebx: %08lx   ecx: %08lx   edx: %08lx\n",
|>+		regs->eax, regs->ebx, regs->ecx, regs->edx);
|>+	printk("esi: %08lx   edi: %08lx   ebp: %08lx   esp: %p\n",
|>+		regs->esi, regs->edi, regs->ebp, esp);
|>+	printk("ds: %04x   es: %04x   ss: %04x\n",
|>+		regs->xds & 0xffff, regs->xes & 0xffff, regs->xss & 0xffff);
|>+	if (!tsk) {
|>+		printk("no stack for this cpu\n");
|>+		return;
|>+	}
|>+	printk("Process %s (pid: %d, stackpage=%08lx)",
|>+		tsk->comm, tsk->pid, 4096+(regs->esp & ~(THREAD_SIZE-1)));
|>+	/*
|>+	 * When in-kernel, we also print out the stack and code at the
|>+	 * time of the fault..
|>+	 */
|>+	if (in_kernel) {
|>+		unsigned long *stack;
|>+		unsigned long addr, module_start, module_end;
|>+		extern char _stext, _etext;
|>+
|>+		extern int kstack_depth_to_print;
|>+
|>+		esp = (unsigned long *)((unsigned long)tsk + (regs->esp & (THREAD_SIZE-1)));
|>+
|>+		printk("\nStack: ");
|>+		stack = esp;
|>+		for(i=0; i < kstack_depth_to_print; i++) {
|>+			if ((unsigned long)stack > (unsigned long)tsk + THREAD_SIZE-1)
|>+				break;
|>+			if (i && ((i % 8) == 0))
|>+				printk("\n       ");
|>+			printk("%08lx ", *stack++);
|>+		}
|>+
|>+		printk("\nCall Trace: ");
|>+		i = 1;
|>+		stack = esp;
|>+		module_start = VMALLOC_START;
|>+		module_end = VMALLOC_END;
|>+		module_end = 0;
|>+		while ((unsigned long)stack < (unsigned long)tsk + THREAD_SIZE) {
|>+			addr = *stack++;
|>+			/*
|>+			 * If the address is either in the text segment of the
|>+			 * kernel, or in the region which contains vmalloc'ed
|>+			 * memory, it *may* be the address of a calling
|>+			 * routine; if so, print it so that someone tracing
|>+			 * down the cause of the crash will be able to figure
|>+			 * out the call path that was taken.
|>+			 */
|>+			if (((addr >= (unsigned long) &_stext) &&
|>+			     (addr <= (unsigned long) &_etext)) ||
|>+			    ((addr >= module_start) && (addr <= module_end))) {
|>+				if (i && ((i % 8) == 0))
|>+					printk("\n       ");
|>+				printk("[<%08lx>] ", addr);
|>+				i++;
|>+			}
|>+		}
|>+		printk("\n");
|>+
|>+		printk("\nCode: ");
|>+		if(regs->eip < PAGE_OFFSET) {
|>+			printk("eip in user space. error.\n");
|>+		}
|>+
|>+		for(i=0;i<20;i++) {
|>+			printk("%02x ", *c++);
|>+		}
|>+	}
|>+	printk("\n");
|>+	return;
|>+}
|>+#endif /* CONFIG_DUMP */
|>+
|> /*
|>  * These constants are for searching for possible module text
|>  * segments.
|>@@ -471,12 +570,33 @@
|> }
|> #endif
|>
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+#ifdef CONFIG_SMP
|>+int (*dump_ipi_function_ptr)(struct pt_regs *) = NULL;
|>+static int dump_ipi(struct pt_regs *regs)
|>+{
|>+	if (!(dump_ipi_function_ptr && dump_ipi_function_ptr(regs))) {
|>+		return 0;
|>+	}
|>+	ack_APIC_irq();
|>+	return 1;
|>+}
|>+#else
|>+#define dump_ipi(regs) 0
|>+#endif
|>+#endif
|>+
|> asmlinkage void do_nmi(struct pt_regs * regs, long error_code)
|> {
|> 	unsigned char reason = inb(0x61);
|>
|>
|> 	++nmi_count(smp_processor_id());
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+	if (dump_ipi(regs)) {
|>+		return;
|>+	}
|>+#endif
|> 	if (!(reason & 0xc0)) {
|> #if CONFIG_X86_IO_APIC
|> 		/*
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/drivers/char/sysrq.c lkcd_cvs_new/2.4/drivers/char/sysrq.c
|>--- lkcd_cvs_orig/2.4/drivers/char/sysrq.c	Fri Nov 23 17:25:29 2001
|>+++ lkcd_cvs_new/2.4/drivers/char/sysrq.c	Tue Nov 27 14:01:47 2001
|>@@ -96,6 +96,15 @@
|> 		dump("sysrq", pt_regs);
|> 		break;
|> #endif
|>+#if defined(CONFIG_DUMP)
|>+	case 'd':
|>+		{
|>+		extern void show_cpu_state(struct pt_regs *);
|>+		printk("Show state of all cpus\n");
|>+		show_cpu_state(pt_regs);
|>+		break;
|>+		}
|>+#endif
|>
|> 	case 'o':					    /* O -- power off */
|> 		if (sysrq_power_off) {
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/drivers/dump/dump_base.c lkcd_cvs_new/2.4/drivers/dump/dump_base.c
|>--- lkcd_cvs_orig/2.4/drivers/dump/dump_base.c	Fri Nov 23 17:25:30 2001
|>+++ lkcd_cvs_new/2.4/drivers/dump/dump_base.c	Mon Nov 26 16:34:44 2001
|>@@ -268,14 +268,12 @@
|> extern struct new_utsname system_utsname;     /* system information        */
|>
|> /* external architecture-specific functions */
|>-extern void __dump_open(struct file *, uint64_t);
|>+extern void __dump_open(void);
|>+extern void __dump_cleanup(void);
|> extern void __dump_init(uint64_t);
|>-extern int __dump_configure_header(dump_header_asm_t *, struct pt_regs *);
|>+extern int __dump_configure_header(struct pt_regs *);
|> extern unsigned int  __dump_silence_system(unsigned int);
|> extern unsigned int  __dump_resume_system(unsigned int);
|>-#ifdef CONFIG_X86
|>-extern void __dump_save_panic_regs(dump_header_asm_t *);
|>-#endif
|>
|> /* external functions                                                      */
|> extern void si_meminfo(struct sysinfo *);
|>@@ -736,7 +734,7 @@
|> 	}
|>
|> 	/* configure architecture-specific dump header values */
|>-	if (!__dump_configure_header(&dump_header_asm, regs)) {
|>+	if (!__dump_configure_header(regs)) {
|> 		return (0);
|> 	}
|> 	return (1);
|>@@ -792,16 +790,11 @@
|>  *       memory pages and dumps the data to disk (using other functions).
|>  */
|> static int
|>-dump_execute_memdump(char *panic_str, struct pt_regs *regs)
|>+dump_execute_memdump(void)
|> {
|> 	int counter = 0, state = 0;
|> 	unsigned long mem_loc, buf_loc;
|>
|>-	if (!dump_configure_header(panic_str, regs)) {
|>-		DUMP_PRINT("Dump header could not be configured!");
|>-		return (-1);
|>-	}
|>-
|> 	DUMP_PRINT("\nDump compression value is 0x%x ...", dump_compress);
|>
|> 	DUMP_PRINT("\nWriting dump header ...");
|>@@ -939,39 +932,23 @@
|> 		return;
|> 	}
|>
|>+	if(!dump_configure_header(panic_str, regs)) {
|>+		DUMP_PRINT("\ndump header could not be configured!");
|>+		return;
|>+	}
|>+
|> 	/* silence the system */
|> 	dump_silence_system();
|>
|> 	/* bail out if we're not going to do any dumping */
|> 	if (dump_level != DUMP_LEVEL_NONE) {
|> 		/* inform users of what we are about to do */
|>-#ifdef CONFIG_SMP
|> 		DUMP_PRINTN("Dumping to device 0x%x [%s] on CPU %d ...",
|> 			dump_device, bdevname(dump_device),
|> 			smp_processor_id());
|>-#else
|>-		DUMP_PRINTN("Dumping to device 0x%x [%s] on CPU %d ...",
|>-			dump_device, bdevname(dump_device),
|>-			0);
|>-#endif
|>
|> 		/* start walking through the page tables */
|>-		state = dump_execute_memdump(panic_str, regs);
|>-
|>-#ifdef CONFIG_X86
|>-		/*
|>-		 * Okay, this is REALLY annoying to have to
|>-		 * do.  What this means is that for x86
|>-		 * systems, we have to literally save the
|>-		 * esp/eip _now_, because we don't want the
|>-		 * esp/eip from dump_write_header() or
|>-		 * anything it calls to conflict with
|>-		 * re-building the panic() stack trace case.
|>-		 * So for that reason, we save the eip/esp
|>-		 * now so we can re-build the trace later.
|>-		 */
|>-		__dump_save_panic_regs(&dump_header_asm);
|>-#endif
|>+		state = dump_execute_memdump();
|>
|> 		/* update header to disk for the last time */
|> 		if (dump_write_header() < 0) {
|>@@ -1054,7 +1031,7 @@
|> 	struct list_head *tmp;
|> 	dump_compress_t *dc;
|>
|>-	/* try to remove the compression item */
|>+	/* try to set the compression type*/
|> 	list_for_each(tmp, &dump_compress_list) {
|> 		dc = list_entry(tmp, dump_compress_t, list);
|> 		if (dc->compress_type == compression_type) {
|>@@ -1210,6 +1187,7 @@
|> 			if (!(f->f_flags & O_RDWR)) {
|> 				return (-EPERM);
|> 			}
|>+			__dump_open();
|> 			return (dump_open_kdev((kdev_t)arg));
|>
|> 		/* get dump_device */
|>@@ -1423,6 +1401,9 @@
|> 	if (dump_page_buf) {
|> 		kfree((const void *)dump_page_buf);
|> 	}
|>+
|>+	/* arch-specific cleanup routine */
|>+	__dump_cleanup();
|>
|> 	/* remove the proc entries */
|> 	dump_proc_cleanup();
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/drivers/dump/dump_i386.c lkcd_cvs_new/2.4/drivers/dump/dump_i386.c
|>--- lkcd_cvs_orig/2.4/drivers/dump/dump_i386.c	Thu Oct  4 14:19:49 2001
|>+++ lkcd_cvs_new/2.4/drivers/dump/dump_i386.c	Tue Nov 27 14:15:26 2001
|>@@ -21,50 +21,143 @@
|> #include <linux/kernel.h>
|> #include <linux/smp.h>
|> #include <linux/fs.h>
|>+#include <linux/vmalloc.h>
|> #include <linux/dump.h>
|> #include <linux/mm.h>
|> #include <asm/processor.h>
|> #include <asm/hardirq.h>
|> #include <linux/irq.h>
|>
|>-extern volatile int dump_in_progress;
|>-extern unsigned long irq_affinity[NR_IRQS];
|> static unsigned long saved_affinity[NR_IRQS];
|>
|>-/*
|>- * Name: __dump_save_panic_regs()
|>- * Func: Save the EIP (really the RA).  We may pass an argument later.
|>- * 	 Save ESP also here.
|>- */
|>-inline void
|>-__dump_save_panic_regs(dump_header_asm_t *dha)
|>-{
|>-	__asm__ __volatile__("movl  %%esp, %0\n"
|>-		: "=r" (dha->dha_esp));
|>-	/* hate to do this, but ... */
|>-#ifdef CONFIG_FRAME_POINTER
|>-	__asm__ __volatile__("movl  4(%%esp), %0\n"
|>-		: "=r" (dha->dha_eip));
|>+static int alloc_dha_stack(void)
|>+{
|>+	int i;
|>+	void *ptr;
|>+
|>+	if (dump_header_asm.dha_stack[0])
|>+		return 0;
|>+
|>+       	ptr = vmalloc(THREAD_SIZE * smp_num_cpus);
|>+	if (!ptr) {
|>+		printk("vmalloc for dha_stacks failed\n");
|>+		return -ENOMEM;
|>+	}
|>+
|>+	for (i = 0; i < smp_num_cpus; i++) {
|>+		dump_header_asm.dha_stack[i] = (void *)((unsigned long)ptr + (i * THREAD_SIZE));
|>+	}
|>+	return 0;
|>+}
|>+
|>+static int free_dha_stack(void)
|>+{
|>+	if (dump_header_asm.dha_stack[0])
|>+		vfree(dump_header_asm.dha_stack[0]);
|>+	return 0;
|>+}
|>+
|>+/* In case of panic dumps, we collects regs on entry to panic.
|>+ * so, we shouldn't 'fix' ssesp here again. But it is hard to
|>+ * tell just looking at regs whether ssesp need fixing. We make
|>+ * this decision by looking at xss in regs. If we have better
|>+ * means to determine that ssesp are valid (by some flag which
|>+ * tells that we are here due to panic dump), then we can use
|>+ * that instead of this kludge.
|>+ */
|>+static inline void
|>+fix_ssesp(struct pt_regs *regs, int cpu)
|>+{
|>+	if (!user_mode(regs)) {
|>+		if ((cpu == dump_header_asm.dha_dumping_cpu) &&
|>+			regs->xss == __KERNEL_DS)
|>+			return;
|>+		dump_header_asm.dha_smp_regs[cpu].esp =
|>+				(unsigned long)&(regs->esp);
|>+		__asm__ __volatile__ ("movw %%ss, %%ax;"
|>+			:"=a"(dump_header_asm.dha_smp_regs[cpu].xss));
|>+	}
|>+}
|>+
|>+static void
|>+save_this_cpu_state(int cpu, struct pt_regs *regs, struct task_struct *tsk)
|>+{
|>+	dump_header_asm.dha_smp_regs[cpu] = *regs;
|>+	dump_header_asm.dha_smp_current_task[cpu] = tsk;
|>+	fix_ssesp(regs, cpu);
|>+
|>+	if (dump_header_asm.dha_stack[cpu]) {
|>+		memcpy(dump_header_asm.dha_stack[cpu], tsk, THREAD_SIZE);
|>+	}
|>+	return;
|>+}
|>+
|>+#ifdef CONFIG_SMP
|>+static int dump_expect_ipi[NR_CPUS];
|>+static atomic_t waiting_for_dump_ipi;
|>+static int wait_for_dump_ipi = 1; /* always wait for ipi to to be handled */
|>+
|>+static int
|>+dump_ipi_handler(struct pt_regs *regs)
|>+{
|>+	int cpu = smp_processor_id();
|>+
|>+	if (!dump_expect_ipi[cpu]) {
|>+		return 0;
|>+	}
|>+
|>+	save_this_cpu_state(cpu, regs, current);
|>+
|>+	dump_expect_ipi[cpu] = 0;
|>+	atomic_dec(&waiting_for_dump_ipi);
|>+	return 1;
|>+}
|>+
|>+/* save registers on other processors */
|>+void
|>+save_other_cpu_states(void)
|>+{
|>+	int i;
|>+
|>+	if (smp_num_cpus > 1) {
|>+		atomic_set(&waiting_for_dump_ipi, smp_num_cpus-1);
|>+		for (i = 0; i < NR_CPUS; i++)
|>+			dump_expect_ipi[i] = 1;
|>+
|>+		dump_ipi_function_ptr = dump_ipi_handler;
|>+		dump_send_ipi();
|>+		/* may be we dont need to wait for NMI to be processed.
|>+		   just write out the header at the end of dumping, if
|>+		   this IPI is not processed untill then, there probably
|>+		   is a problem and we just fail to capture state of
|>+		   other cpus. */
|>+		if (wait_for_dump_ipi) {
|>+			while(atomic_read(&waiting_for_dump_ipi))
|>+				barrier();
|>+			dump_ipi_function_ptr = NULL;
|>+		}
|>+	}
|>+	return;
|>+}
|> #else
|>-	__asm__ __volatile__("movl  (%%esp), %0\n"
|>-		: "=r" (dha->dha_eip));
|>+#define save_other_cpu_states()
|> #endif
|>-}
|>
|> /*
|>  * Name: __dump_configure_header()
|>  * Func: Configure the dump header with all proper values.
|>  */
|> int
|>-__dump_configure_header(dump_header_asm_t *dha, struct pt_regs *regs)
|>+__dump_configure_header(struct pt_regs *regs)
|> {
|>-	/* save the dump specific esp/eip */
|>-	__dump_save_panic_regs(dha);
|>+	int cpu = smp_processor_id();
|>
|>-	/* one final check -- modify if we're in user mode */
|>-	if ((regs) && (!user_mode(regs))) {
|>-		dha->dha_regs.esp = (unsigned long) &(regs->esp);
|>-	}
|>+	dump_header_asm.dha_smp_num_cpus = smp_num_cpus;
|>+	dump_header_asm.dha_dumping_cpu = cpu;
|>+
|>+	save_this_cpu_state(cpu, regs, current);
|>+
|>+	save_other_cpu_states();
|>
|> 	return (1);
|> }
|>@@ -87,13 +180,28 @@
|>  *       case it's necessary in the future.
|>  */
|> void
|>-__dump_open(struct file *dump_file, uint64_t memory_size)
|>+__dump_open(void)
|> {
|>+	alloc_dha_stack();
|> 	/* return */
|> 	return;
|> }
|>
|> /*
|>+ * Name: __dump_cleanup()
|>+ * Func: Free any architecture specific data structures. This is called
|>+ *       when the dump module is being removed.
|>+ */
|>+void
|>+__dump_cleanup(void)
|>+{
|>+	free_dha_stack();
|>+	/* return */
|>+	return;
|>+}
|>+
|>+#ifdef CONFIG_SMP
|>+/*
|>  * Non dumping cpus will spin here. If a cpu is handling an irq when ipi is
|>  * received, we let go of it here while making sure that it hits schedule
|>  * on the way up and make it spin there instead.
|>@@ -108,6 +216,7 @@
|> 	}
|> 	return;
|> }
|>+#endif
|>
|> /*
|>  * Routine to save the old irq affinities and change affinities of all irqs to
|>@@ -179,4 +288,24 @@
|>
|> 	/* return */
|> 	return (0);
|>+}
|>+
|>+/* located in arch/i386/kernel/traps.c */
|>+extern void show_this_cpu_state(int cpu, struct pt_regs * regs, struct task_struct *tsk);
|>+
|>+void
|>+show_cpu_state(struct pt_regs * regs)
|>+{
|>+	int cpu = smp_processor_id();
|>+	int i;
|>+
|>+	__dump_configure_header(regs);
|>+
|>+	printk("__dump_configure_header done from cpu %d\n", cpu);
|>+
|>+	for (i = 0; i < smp_num_cpus; i++) {
|>+		show_this_cpu_state(i, dump_header_asm.dha_smp_regs[i], dump_header_asm.dha_stack[i]);
|>+	}
|>+
|>+	return;
|> }
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/include/asm-i386/dump.h lkcd_cvs_new/2.4/include/asm-i386/dump.h
|>--- lkcd_cvs_orig/2.4/include/asm-i386/dump.h	Wed Sep 26 15:21:38 2001
|>+++ lkcd_cvs_new/2.4/include/asm-i386/dump.h	Mon Nov 26 14:11:49 2001
|>@@ -14,6 +14,7 @@
|>
|> /* necessary header files */
|> #include <asm/ptrace.h>                          /* for pt_regs             */
|>+#include <linux/threads.h>
|>
|> /* definitions */
|> #define DUMP_ASM_MAGIC_NUMBER     0xdeaddeadULL  /* magic number            */
|>@@ -45,6 +46,44 @@
|> 	/* the dump registers */
|> 	struct pt_regs       dha_regs;
|>
|>+	/* smp specific */
|>+	uint32_t	     dha_smp_num_cpus;
|>+	int		     dha_dumping_cpu;
|>+	struct pt_regs	     dha_smp_regs[NR_CPUS];
|>+	void *		     dha_smp_current_task[NR_CPUS];
|>+	void *		     dha_stack[NR_CPUS];
|> } dump_header_asm_t;
|>+
|>+#ifdef __KERNEL__
|>+static inline void get_current_regs(struct pt_regs *regs)
|>+{
|>+	__asm__ __volatile__("movl %%ebx,%0" : "=m"(regs->ebx));
|>+	__asm__ __volatile__("movl %%ecx,%0" : "=m"(regs->ecx));
|>+	__asm__ __volatile__("movl %%edx,%0" : "=m"(regs->edx));
|>+	__asm__ __volatile__("movl %%esi,%0" : "=m"(regs->esi));
|>+	__asm__ __volatile__("movl %%edi,%0" : "=m"(regs->edi));
|>+	__asm__ __volatile__("movl %%ebp,%0" : "=m"(regs->ebp));
|>+	__asm__ __volatile__("movl %%eax,%0" : "=m"(regs->eax));
|>+	__asm__ __volatile__("movl %%esp,%0" : "=m"(regs->esp));
|>+	__asm__ __volatile__("movw %%ss, %%ax;" :"=a"(regs->xss));
|>+	__asm__ __volatile__("movw %%cs, %%ax;" :"=a"(regs->xcs));
|>+	__asm__ __volatile__("movw %%ds, %%ax;" :"=a"(regs->xds));
|>+	__asm__ __volatile__("movw %%es, %%ax;" :"=a"(regs->xes));
|>+	__asm__ __volatile__("pushfl; popl %0" :"=m"(regs->eflags));
|>+	regs->eip = (unsigned long)current_text_addr();
|>+
|>+}
|>+
|>+extern volatile int dump_in_progress;
|>+extern unsigned long irq_affinity[];
|>+extern dump_header_asm_t dump_header_asm;
|>+
|>+#ifdef CONFIG_SMP
|>+extern int (*dump_ipi_function_ptr)(struct pt_regs *);
|>+extern void dump_send_ipi(void);
|>+#else
|>+#define dump_send_ipi()
|>+#endif
|>+#endif /* __KERNEL__ */
|>
|> #endif /* _ASM_DUMP_H */
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/include/asm-i386/hw_irq.h lkcd_cvs_new/2.4/include/asm-i386/hw_irq.h
|>--- lkcd_cvs_orig/2.4/include/asm-i386/hw_irq.h	Thu Jan  1 05:30:00 1970
|>+++ lkcd_cvs_new/2.4/include/asm-i386/hw_irq.h	Mon Nov 26 14:13:23 2001
|>@@ -0,0 +1,226 @@
|>+#ifndef _ASM_HW_IRQ_H
|>+#define _ASM_HW_IRQ_H
|>+
|>+/*
|>+ *	linux/include/asm/hw_irq.h
|>+ *
|>+ *	(C) 1992, 1993 Linus Torvalds, (C) 1997 Ingo Molnar
|>+ *
|>+ *	moved some of the old arch/i386/kernel/irq.h to here. VY
|>+ *
|>+ *	IRQ/IPI changes taken from work by Thomas Radke
|>+ *	<tomsoft@informatik.tu-chemnitz.de>
|>+ */
|>+
|>+#include <linux/config.h>
|>+#include <asm/atomic.h>
|>+#include <asm/irq.h>
|>+
|>+/*
|>+ * IDT vectors usable for external interrupt sources start
|>+ * at 0x20:
|>+ */
|>+#define FIRST_EXTERNAL_VECTOR	0x20
|>+
|>+#define SYSCALL_VECTOR		0x80
|>+
|>+/*
|>+ * Vectors 0x20-0x2f are used for ISA interrupts.
|>+ */
|>+
|>+/*
|>+ * Special IRQ vectors used by the SMP architecture, 0xf0-0xff
|>+ *
|>+ *  some of the following vectors are 'rare', they are merged
|>+ *  into a single vector (CALL_FUNCTION_VECTOR) to save vector space.
|>+ *  TLB, reschedule and local APIC vectors are performance-critical.
|>+ *
|>+ *  Vectors 0xf0-0xfa are free (reserved for future Linux use).
|>+ */
|>+#define SPURIOUS_APIC_VECTOR	0xff
|>+#define ERROR_APIC_VECTOR	0xfe
|>+#define INVALIDATE_TLB_VECTOR	0xfd
|>+#define RESCHEDULE_VECTOR	0xfc
|>+#define CALL_FUNCTION_VECTOR	0xfb
|>+#define DUMP_VECTOR		0xfa
|>+
|>+/*
|>+ * Local APIC timer IRQ vector is on a different priority level,
|>+ * to work around the 'lost local interrupt if more than 2 IRQ
|>+ * sources per level' errata.
|>+ */
|>+#define LOCAL_TIMER_VECTOR	0xef
|>+
|>+/*
|>+ * First APIC vector available to drivers: (vectors 0x30-0xee)
|>+ * we start at 0x31 to spread out vectors evenly between priority
|>+ * levels. (0x80 is the syscall vector)
|>+ */
|>+#define FIRST_DEVICE_VECTOR	0x31
|>+#define FIRST_SYSTEM_VECTOR	0xef
|>+
|>+extern int irq_vector[NR_IRQS];
|>+#define IO_APIC_VECTOR(irq)	irq_vector[irq]
|>+
|>+/*
|>+ * Various low-level irq details needed by irq.c, process.c,
|>+ * time.c, io_apic.c and smp.c
|>+ *
|>+ * Interrupt entry/exit code at both C and assembly level
|>+ */
|>+
|>+extern void mask_irq(unsigned int irq);
|>+extern void unmask_irq(unsigned int irq);
|>+extern void disable_8259A_irq(unsigned int irq);
|>+extern void enable_8259A_irq(unsigned int irq);
|>+extern int i8259A_irq_pending(unsigned int irq);
|>+extern void make_8259A_irq(unsigned int irq);
|>+extern void init_8259A(int aeoi);
|>+extern void FASTCALL(send_IPI_self(int vector));
|>+extern void init_VISWS_APIC_irqs(void);
|>+extern void setup_IO_APIC(void);
|>+extern void disable_IO_APIC(void);
|>+extern void print_IO_APIC(void);
|>+extern int IO_APIC_get_PCI_irq_vector(int bus, int slot, int fn);
|>+extern void send_IPI(int dest, int vector);
|>+
|>+extern unsigned long io_apic_irqs;
|>+
|>+extern atomic_t irq_err_count;
|>+extern atomic_t irq_mis_count;
|>+
|>+extern char _stext, _etext;
|>+
|>+#define IO_APIC_IRQ(x) (((x) >= 16) || ((1<<(x)) & io_apic_irqs))
|>+
|>+#define __STR(x) #x
|>+#define STR(x) __STR(x)
|>+
|>+#define SAVE_ALL \
|>+	"cld\n\t" \
|>+	"pushl %es\n\t" \
|>+	"pushl %ds\n\t" \
|>+	"pushl %eax\n\t" \
|>+	"pushl %ebp\n\t" \
|>+	"pushl %edi\n\t" \
|>+	"pushl %esi\n\t" \
|>+	"pushl %edx\n\t" \
|>+	"pushl %ecx\n\t" \
|>+	"pushl %ebx\n\t" \
|>+	"movl $" STR(__KERNEL_DS) ",%edx\n\t" \
|>+	"movl %edx,%ds\n\t" \
|>+	"movl %edx,%es\n\t"
|>+
|>+#define IRQ_NAME2(nr) nr##_interrupt(void)
|>+#define IRQ_NAME(nr) IRQ_NAME2(IRQ##nr)
|>+
|>+#define GET_CURRENT \
|>+	"movl %esp, %ebx\n\t" \
|>+	"andl $-8192, %ebx\n\t"
|>+
|>+/*
|>+ *	SMP has a few special interrupts for IPI messages
|>+ */
|>+
|>+	/* there is a second layer of macro just to get the symbolic
|>+	   name for the vector evaluated. This change is for RTLinux */
|>+#define BUILD_SMP_INTERRUPT(x,v) XBUILD_SMP_INTERRUPT(x,v)
|>+#define XBUILD_SMP_INTERRUPT(x,v)\
|>+asmlinkage void x(void); \
|>+asmlinkage void call_##x(void); \
|>+__asm__( \
|>+"\n"__ALIGN_STR"\n" \
|>+SYMBOL_NAME_STR(x) ":\n\t" \
|>+	"pushl $"#v"\n\t" \
|>+	SAVE_ALL \
|>+	SYMBOL_NAME_STR(call_##x)":\n\t" \
|>+	"call "SYMBOL_NAME_STR(smp_##x)"\n\t" \
|>+	"jmp ret_from_intr\n");
|>+
|>+#define BUILD_SMP_TIMER_INTERRUPT(x,v) XBUILD_SMP_TIMER_INTERRUPT(x,v)
|>+#define XBUILD_SMP_TIMER_INTERRUPT(x,v) \
|>+asmlinkage void x(struct pt_regs * regs); \
|>+asmlinkage void call_##x(void); \
|>+__asm__( \
|>+"\n"__ALIGN_STR"\n" \
|>+SYMBOL_NAME_STR(x) ":\n\t" \
|>+	"pushl $"#v"\n\t" \
|>+	SAVE_ALL \
|>+	"movl %esp,%eax\n\t" \
|>+	"pushl %eax\n\t" \
|>+	SYMBOL_NAME_STR(call_##x)":\n\t" \
|>+	"call "SYMBOL_NAME_STR(smp_##x)"\n\t" \
|>+	"addl $4,%esp\n\t" \
|>+	"jmp ret_from_intr\n");
|>+
|>+#define BUILD_COMMON_IRQ() \
|>+asmlinkage void call_do_IRQ(void); \
|>+__asm__( \
|>+	"\n" __ALIGN_STR"\n" \
|>+	"common_interrupt:\n\t" \
|>+	SAVE_ALL \
|>+	"pushl $ret_from_intr\n\t" \
|>+	SYMBOL_NAME_STR(call_do_IRQ)":\n\t" \
|>+	"jmp "SYMBOL_NAME_STR(do_IRQ));
|>+
|>+/*
|>+ * subtle. orig_eax is used by the signal code to distinct between
|>+ * system calls and interrupted 'random user-space'. Thus we have
|>+ * to put a negative value into orig_eax here. (the problem is that
|>+ * both system calls and IRQs want to have small integer numbers in
|>+ * orig_eax, and the syscall code has won the optimization conflict ;)
|>+ *
|>+ * Subtle as a pigs ear.  VY
|>+ */
|>+
|>+#define BUILD_IRQ(nr) \
|>+asmlinkage void IRQ_NAME(nr); \
|>+__asm__( \
|>+"\n"__ALIGN_STR"\n" \
|>+SYMBOL_NAME_STR(IRQ) #nr "_interrupt:\n\t" \
|>+	"pushl $"#nr"-256\n\t" \
|>+	"jmp common_interrupt");
|>+
|>+extern unsigned long prof_cpu_mask;
|>+extern unsigned int * prof_buffer;
|>+extern unsigned long prof_len;
|>+extern unsigned long prof_shift;
|>+
|>+/*
|>+ * x86 profiling function, SMP safe. We might want to do this in
|>+ * assembly totally?
|>+ */
|>+static inline void x86_do_profile (unsigned long eip)
|>+{
|>+	if (!prof_buffer)
|>+		return;
|>+
|>+	/*
|>+	 * Only measure the CPUs specified by /proc/irq/prof_cpu_mask.
|>+	 * (default is all CPUs.)
|>+	 */
|>+	if (!((1<<smp_processor_id()) & prof_cpu_mask))
|>+		return;
|>+
|>+	eip -= (unsigned long) &_stext;
|>+	eip >>= prof_shift;
|>+	/*
|>+	 * Don't ignore out-of-bounds EIP values silently,
|>+	 * put them into the last histogram slot, so if
|>+	 * present, they will show up as a sharp peak.
|>+	 */
|>+	if (eip > prof_len-1)
|>+		eip = prof_len-1;
|>+	atomic_inc((atomic_t *)&prof_buffer[eip]);
|>+}
|>+
|>+#ifdef CONFIG_SMP /*more of this file should probably be ifdefed SMP */
|>+static inline void hw_resend_irq(struct hw_interrupt_type *h, unsigned int i) {
|>+	if (IO_APIC_IRQ(i))
|>+		send_IPI_self(IO_APIC_VECTOR(i));
|>+}
|>+#else
|>+static inline void hw_resend_irq(struct hw_interrupt_type *h, unsigned int i) {}
|>+#endif
|>+
|>+#endif /* _ASM_HW_IRQ_H */
|>diff -urN -X dontdiff lkcd_cvs_orig/2.4/kernel/panic.c lkcd_cvs_new/2.4/kernel/panic.c
|>--- lkcd_cvs_orig/2.4/kernel/panic.c	Tue Oct 16 12:51:46 2001
|>+++ lkcd_cvs_new/2.4/kernel/panic.c	Mon Nov 26 17:33:33 2001
|>@@ -56,6 +56,10 @@
|> #if defined(CONFIG_ARCH_S390)
|>         unsigned long caller = (unsigned long) __builtin_return_address(0);
|> #endif
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+	struct pt_regs regs;
|>+	get_current_regs(&regs);
|>+#endif
|>
|> 	va_start(args, fmt);
|> 	vsprintf(buf, fmt, args);
|>@@ -78,7 +82,9 @@
|>
|> 	notifier_call_chain(&panic_notifier_list, 0, NULL);
|>
|>-	dump(buf, NULL);
|>+#if defined(CONFIG_DUMP) || defined(CONFIG_DUMP_MODULE)
|>+	dump(buf, &regs);
|>+#endif
|>
|> 	if (panic_timeout > 0)
|> 	{
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/Makefile lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/Makefile
|>--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/Makefile	Fri Jan 26 02:42:01 2001
|>+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/Makefile	Tue Nov 27 13:08:26 2001
|>@@ -8,7 +8,7 @@
|> include $(DEPTH)/commondefs
|>
|> TARGETS   = $(DEPTH)/libarch.a
|>-CFILES    = i386_cmds.c cmd_mktrace.c
|>+CFILES    = i386_cmds.c cmd_mktrace.c cmd_rd.c cmd_defcpu.c
|> OFILES    = $(CFILES:.c=.o)
|>
|> all: default
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c
|>--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c	Thu Jan  1 05:30:00 1970
|>+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_defcpu.c	Tue Nov 27 13:30:01 2001
|>@@ -0,0 +1,88 @@
|>+#include <lcrash.h>
|>+
|>+extern int get_dump_header_asm(dump_header_asm_t *);
|>+
|>+int defcpu = -1;
|>+
|>+/*
|>+ * deftask_cmd() -- Run the 'deftask' command.
|>+ */
|>+int
|>+defcpu_cmd(command_t *cmd)
|>+{
|>+	dump_header_asm_t dha;
|>+	int cpu;
|>+
|>+	if (cmd->nargs == 0) {
|>+		if (defcpu == -1) {
|>+			fprintf(cmd->efp, "No default cpu set\n");
|>+		} else {
|>+			fprintf(cmd->ofp, "Default cpu is %d\n", defcpu);
|>+		}
|>+		return(0);
|>+	}
|>+
|>+	if (MIP->core_type != reg_core) {
|>+		fprintf(cmd->efp, "Can't use this command on live system\n");
|>+		return (1);
|>+	}
|>+	if (get_dump_header_asm(&dha))
|>+		return (1);
|>+
|>+	cpu = strtol(cmd->args[0], NULL, 10);
|>+
|>+	if (cpu >= dha.dha_smp_num_cpus) {
|>+		fprintf(cmd->efp, "Error setting defcpu to %s\n", cmd->args[0]);
|>+		return (1);
|>+	}
|>+	defcpu = cpu;
|>+	fprintf(cmd->ofp, "Default cpu is %d\n", defcpu);
|>+
|>+	if (dha.dha_stack[defcpu]) {
|>+		deftask = (kaddr_t)dha.dha_smp_current_task[defcpu];
|>+		fprintf(cmd->ofp, "Default task is 0x%x\n", deftask);
|>+	}
|>+	return (0);
|>+}
|>+
|>+#define _DEFCPU_USAGE	"[-w outfile] [cpu]"
|>+
|>+/*
|>+ * defcpu_usage() -- Print the usage string for the 'defcpu' command.
|>+ */
|>+void
|>+defcpu_usage(command_t *cmd)
|>+{
|>+	CMD_USAGE(cmd, _DEFCPU_USAGE);
|>+}
|>+
|>+/*
|>+ * defcpu_help() -- Print the help information for the 'defcpu' command.
|>+ */
|>+void
|>+defcpu_help(command_t *cmd)
|>+{
|>+	CMD_HELP(cmd, _DEFCPU_USAGE,
|>+	"Set the default cpu if one is indicated. Otherwise print the "
|>+	"value of default cpu."
|>+        "When 'lcrash' is run on a live system, defcpu has no "
|>+        "meaning.\n\n"
|>+	"This command also sets the default task to the task running "
|>+	"on the default cpu at the time the dump is taken. "
|>+	"The rd command will display the registers on the default cpu "
|>+	"at the time the dump is taken. "
|>+        "The trace command will display a trace wrt the task "
|>+        "running on the default cpu at the time the dump is taken. ");
|>+}
|>+
|>+/*
|>+ * defcpu_parse() -- Parse the command line arguments for 'defcpu'.
|>+ */
|>+int
|>+defcpu_parse(command_t *cmd)
|>+{
|>+	if (set_cmd_flags(cmd, (C_WRITE), 0)) {
|>+		return(1);
|>+	}
|>+	return(0);
|>+}
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c
|>--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c	Thu Jan  1 05:30:00 1970
|>+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/cmd_rd.c	Mon Nov 26 16:43:31 2001
|>@@ -0,0 +1,65 @@
|>+#include <lcrash.h>
|>+
|>+extern int get_dump_header_asm(dump_header_asm_t *dump_header_asm);
|>+extern int defcpu;
|>+
|>+#define _RD_USAGE "[-w outfile]"
|>+
|>+void
|>+rd_usage(command_t *cmd)
|>+{
|>+	CMD_USAGE(cmd, _RD_USAGE);
|>+}
|>+
|>+void
|>+rd_help(command_t *cmd)
|>+{
|>+	CMD_HELP(cmd, _RD_USAGE,
|>+			"Display the register contents of the default cpu."
|>+			"This command can't be used on a live system ");
|>+}
|>+
|>+int
|>+rd_parse(command_t *cmd)
|>+{
|>+	if (set_cmd_flags(cmd, (C_WRITE), 0)) {
|>+		return(1);
|>+	}
|>+	return 0;
|>+}
|>+
|>+int
|>+rd_cmd(command_t *cmd)
|>+{
|>+	dump_header_asm_t dha;
|>+	struct pt_regs * regs;
|>+
|>+	if (cmd->nargs != 0) {
|>+		rd_usage(cmd);
|>+		return(1);
|>+	}
|>+
|>+	if (MIP->core_type != reg_core) {
|>+		fprintf(cmd->efp, "Can't use this command on live system\n");
|>+		return(1);
|>+	}
|>+
|>+	if (get_dump_header_asm(&dha))
|>+		return(1);
|>+
|>+	if (defcpu == -1)
|>+		defcpu = dha.dha_dumping_cpu;
|>+
|>+	regs = &dha.dha_smp_regs[defcpu];
|>+
|>+	fprintf(cmd->ofp, "CPU:    %d   EIP:    %04x:[<%08lx>]\nEFLAGS: %08lx\n",
|>+		defcpu, regs->xcs & 0xffff, regs->eip, regs->eflags);
|>+	fprintf(cmd->ofp, "eax: %08lx   ebx: %08lx   ecx: %08lx   edx: %08lx\n",
|>+		regs->eax, regs->ebx, regs->ecx, regs->edx);
|>+	fprintf(cmd->ofp, "esi: %08lx   edi: %08lx   ebp: %08lx   esp: %08lx\n",
|>+		regs->esi, regs->edi, regs->ebp, regs->esp);
|>+	fprintf(cmd->ofp, "ds: %04x   es: %04x   ss: %04x\n",
|>+		regs->xds & 0xffff, regs->xes & 0xffff, regs->xss & 0xffff);
|>+
|>+	return(0);
|>+}
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c
|>--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c	Fri Nov 17 05:06:51 2000
|>+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/cmds/i386_cmds.c	Tue Nov 27 13:07:45 2001
|>@@ -6,8 +6,16 @@
|> extern int mktrace_cmd(command_t *), mktrace_parse(command_t *);
|> extern void mktrace_help(command_t *), mktrace_usage(command_t *);
|>
|>+extern int rd_cmd(command_t *), rd_parse(command_t *);
|>+extern void rd_help(command_t *), rd_usage(command_t *);
|>+
|>+extern int defcpu_cmd(command_t *), defcpu_parse(command_t *);
|>+extern void defcpu_help(command_t *), defcpu_usage(command_t *);
|>+
|> _command_t i386_cmdset[] = {
|> 	{"mktrace", 0, mktrace_cmd, mktrace_parse, mktrace_help, mktrace_usage},
|> 	{"mt", "mktrace" },
|>+	{"rd", 0, rd_cmd, rd_parse, rd_help, rd_usage},
|>+	{"defcpu", 0, defcpu_cmd, defcpu_parse, defcpu_help, defcpu_usage},
|> 	{(char *)0 }
|> };
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/lib/trace.c lkcd_cvs_new/lkcdutils/lcrash/arch/i386/lib/trace.c
|>--- lkcd_cvs_orig/lkcdutils/lcrash/arch/i386/lib/trace.c	Tue Jul  3 19:37:36 2001
|>+++ lkcd_cvs_new/lkcdutils/lcrash/arch/i386/lib/trace.c	Mon Nov 26 13:22:32 2001
|>@@ -741,9 +741,9 @@
|> 		return(1);
|> 	} else {
|> 		saddr = kl_kernelstack(task);
|>-		if (task == kl_dumptask()) {
|>-			eip = kl_dumpeip();
|>-			esp = kl_dumpesp();
|>+		if (kl_smp_dumptask(task)) {
|>+			eip = kl_dumpeip(task);
|>+			esp = kl_dumpesp(task);
|> 		} else {
|> 			if (LINUX_2_2_X(KL_LINUX_RELEASE)) {
|> 				eip = KL_UINT(K_PTR(tsp, "task_struct", "tss"),
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/arch/i386/kl_dump.c lkcd_cvs_new/lkcdutils/libklib/arch/i386/kl_dump.c
|>--- lkcd_cvs_orig/lkcdutils/libklib/arch/i386/kl_dump.c	Thu Oct 12 02:32:54 2000
|>+++ lkcd_cvs_new/lkcdutils/libklib/arch/i386/kl_dump.c	Mon Nov 26 13:11:08 2001
|>@@ -9,7 +9,7 @@
|> /*
|>  * get_dump_header()
|>  */
|>-static int
|>+int
|> get_dump_header(dump_header_t *dump_header)
|> {
|> 	/* first, make sure this isn't a live system
|>@@ -42,7 +42,7 @@
|> /*
|>  * get_dump_header_asm()
|>  */
|>-static int
|>+int
|> get_dump_header_asm(dump_header_asm_t *dump_header_asm)
|> {
|> 	dump_header_t dump_header;
|>@@ -90,36 +90,40 @@
|>  * kl_dumpesp()
|>  */
|> kaddr_t
|>-kl_dumpesp(void)
|>+kl_dumpesp(kaddr_t tsk)
|> {
|>-	dump_header_asm_t dump_header_asm;
|>+	dump_header_asm_t dha;
|>+	int i;
|>
|>-	if (get_dump_header_asm(&dump_header_asm)) {
|>+	if (get_dump_header_asm(&dha)) {
|> 		return((kaddr_t)NULL);
|> 	}
|>-	if (dump_header_asm.dha_regs.esp) {
|>-		return((kaddr_t)dump_header_asm.dha_regs.esp);
|>-	} else {
|>-		return((kaddr_t)dump_header_asm.dha_esp);
|>+
|>+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
|>+		if (tsk == dha.dha_smp_current_task[i])
|>+			return (dha.dha_smp_regs[i].esp);
|> 	}
|>+	return((kaddr_t)NULL);
|> }
|>
|> /*
|>  * kl_dumpeip()
|>  */
|> kaddr_t
|>-kl_dumpeip(void)
|>+kl_dumpeip(kaddr_t tsk)
|> {
|>-	dump_header_asm_t dump_header_asm;
|>+	dump_header_asm_t dha;
|>+	int i;
|>
|>-	if (get_dump_header_asm(&dump_header_asm)) {
|>+	if (get_dump_header_asm(&dha)) {
|> 		return((kaddr_t)NULL);
|> 	}
|>-	if (dump_header_asm.dha_regs.eip) {
|>-		return((kaddr_t)dump_header_asm.dha_regs.eip);
|>-	} else {
|>-		return((kaddr_t)dump_header_asm.dha_eip);
|>+
|>+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
|>+		if (tsk == dha.dha_smp_current_task[i])
|>+			return (dha.dha_smp_regs[i].eip);
|> 	}
|>+	return((kaddr_t)NULL);
|> }
|>
|> /*
|>@@ -134,5 +138,23 @@
|> 		return((kaddr_t)NULL);
|> 	}
|> 	return((kaddr_t)dump_header.dh_current_task);
|>+
|>+}
|>+
|>+int
|>+kl_smp_dumptask(kaddr_t tsk)
|>+{
|>+	dump_header_asm_t dha;
|>+	int i;
|>+
|>+	if (get_dump_header_asm(&dha)) {
|>+		return((kaddr_t)NULL);
|>+	}
|>+
|>+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
|>+		if (dha.dha_smp_regs[i].eip > KL_PAGE_OFFSET && tsk == dha.dha_smp_current_task[i])
|>+			return (1);
|>+	}
|>+	return (0);
|> }
|>
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/include/asm-i386/dump.h lkcd_cvs_new/lkcdutils/libklib/include/asm-i386/dump.h
|>--- lkcd_cvs_orig/lkcdutils/libklib/include/asm-i386/dump.h	Wed Sep  5 13:38:00 2001
|>+++ lkcd_cvs_new/lkcdutils/libklib/include/asm-i386/dump.h	Mon Nov 26 15:41:09 2001
|>@@ -4,7 +4,8 @@
|>  * Created by: Matt Robinson (yakker@sgi.com)
|>  *
|>  * Copyright 1999 Silicon Graphics, Inc. All rights reserved.
|>- *
|>+ *
|>+ * This code is released under version 2 of the GNU GPL.
|>  */
|>
|> /* This header file holds the architecture specific crash dump header */
|>@@ -13,6 +14,7 @@
|>
|> /* necessary header files */
|> #include <asm/ptrace.h>                          /* for pt_regs             */
|>+#include <linux/threads.h>
|>
|> /* definitions */
|> #define DUMP_ASM_MAGIC_NUMBER     0xdeaddeadULL  /* magic number            */
|>@@ -44,17 +46,44 @@
|> 	/* the dump registers */
|> 	struct pt_regs       dha_regs;
|>
|>+	/* smp specific */
|>+	uint32_t	     dha_smp_num_cpus;
|>+	int		     dha_dumping_cpu;
|>+	struct pt_regs	     dha_smp_regs[NR_CPUS];
|>+	void *		     dha_smp_current_task[NR_CPUS];
|>+	void *		     dha_stack[NR_CPUS];
|> } dump_header_asm_t;
|>
|> #ifdef __KERNEL__
|>-extern void __dump_open(struct file *, uint64_t);
|>-extern void __dump_init(uint64_t);
|>-extern void __dump_silence_system(void);
|>-extern void __dump_resume_system(void);
|>-extern int __dump_configure_header(dump_header_asm_t *, struct pt_regs *);
|>-#ifdef CONFIG_X86
|>-extern void __dump_save_panic_regs(dump_header_asm_t *);
|>-#endif
|>+static inline void get_current_regs(struct pt_regs *regs)
|>+{
|>+	__asm__ __volatile__("movl %%ebx,%0" : "=m"(regs->ebx));
|>+	__asm__ __volatile__("movl %%ecx,%0" : "=m"(regs->ecx));
|>+	__asm__ __volatile__("movl %%edx,%0" : "=m"(regs->edx));
|>+	__asm__ __volatile__("movl %%esi,%0" : "=m"(regs->esi));
|>+	__asm__ __volatile__("movl %%edi,%0" : "=m"(regs->edi));
|>+	__asm__ __volatile__("movl %%ebp,%0" : "=m"(regs->ebp));
|>+	__asm__ __volatile__("movl %%eax,%0" : "=m"(regs->eax));
|>+	__asm__ __volatile__("movl %%esp,%0" : "=m"(regs->esp));
|>+	__asm__ __volatile__("movw %%ss, %%ax;" :"=a"(regs->xss));
|>+	__asm__ __volatile__("movw %%cs, %%ax;" :"=a"(regs->xcs));
|>+	__asm__ __volatile__("movw %%ds, %%ax;" :"=a"(regs->xds));
|>+	__asm__ __volatile__("movw %%es, %%ax;" :"=a"(regs->xes));
|>+	__asm__ __volatile__("pushfl; popl %0" :"=m"(regs->eflags));
|>+	regs->eip = (unsigned long)current_text_addr();
|>+
|>+}
|>+
|>+extern volatile int dump_in_progress;
|>+extern unsigned long irq_affinity[];
|>+extern dump_header_asm_t dump_header_asm;
|>+
|>+#ifdef CONFIG_SMP
|>+extern int (*dump_ipi_function_ptr)(struct pt_regs *);
|>+extern void dump_send_ipi(void);
|>+#else
|>+#define dump_send_ipi()
|> #endif
|>+#endif /* __KERNEL__ */
|>
|> #endif /* _ASM_DUMP_H */
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/include/kl_task.h lkcd_cvs_new/lkcdutils/libklib/include/kl_task.h
|>--- lkcd_cvs_orig/lkcdutils/libklib/include/kl_task.h	Thu Oct 12 02:32:54 2000
|>+++ lkcd_cvs_new/lkcdutils/libklib/include/kl_task.h	Mon Nov 26 13:17:16 2001
|>@@ -9,7 +9,8 @@
|> int kl_parent_pid(void *);
|> kaddr_t kl_pid_to_task(kaddr_t);
|> k_error_t kl_get_task_struct(kaddr_t, int, void *);
|>-kaddr_t kl_dumpeip(void);
|>-kaddr_t kl_dumpesp(void);
|>+kaddr_t kl_dumpeip(kaddr_t tsk);
|>+kaddr_t kl_dumpesp(kaddr_t tsk);
|>+int kl_smp_dumptask(kaddr_t tsk);
|> kaddr_t kl_dumptask(void);
|> kaddr_t kl_kernelstack(kaddr_t);
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/libklib/kl_memory.c lkcd_cvs_new/lkcdutils/libklib/kl_memory.c
|>--- lkcd_cvs_orig/lkcdutils/libklib/kl_memory.c	Fri Nov 23 17:25:35 2001
|>+++ lkcd_cvs_new/lkcdutils/libklib/kl_memory.c	Mon Nov 26 13:15:58 2001
|>@@ -123,6 +123,34 @@
|> 	return((meminfo_t *)NULL);
|> }
|>
|>+extern int get_dump_header_asm(dump_header_asm_t *dha);
|>+kaddr_t
|>+__kl_fix_vaddr(kaddr_t vaddr, size_t sz)
|>+{
|>+	dump_header_asm_t dha;
|>+	kaddr_t cur_task;
|>+	int i;
|>+
|>+	if (MIP->core_type != reg_core) {
|>+		return vaddr;
|>+	}
|>+	if (get_dump_header_asm(&dha))
|>+		return vaddr;
|>+
|>+	/* this is a very simplistic check to see if we have saved
|>+	 * (snapshotted) this particular block. This is very limited
|>+	 * to finding the saved task structs only.
|>+	 */
|>+	for (i = 0; i < dha.dha_smp_num_cpus; i++) {
|>+		if (dha.dha_smp_regs[i].eip < KL_PAGE_OFFSET)
|>+			continue; /* if task is in user space, no need to look at saved stack */
|>+		cur_task = dha.dha_smp_current_task[i];
|>+		if (vaddr >= cur_task && vaddr + sz <  cur_task + KSTACK_SIZE)
|>+			return (dha.dha_stack[i] + (vaddr - cur_task));
|>+	}
|>+	return vaddr;
|>+}
|>+
|> /*
|>  * get_block()
|>  *
|>@@ -142,13 +170,16 @@
|> 		KL_ERROR = KLE_ZERO_SIZE;
|> 	} else {
|> 		while (size > 0){
|>+			kaddr_t tmp = vaddr;
|> 			s=((vaddr & KL_PAGE_MASK) | (~KL_PAGE_MASK)) -
|> 				vaddr + 1;
|> 			s= (size > s) ? s : size;
|>+			vaddr = __kl_fix_vaddr(vaddr, s);
|> 			if ( kl_virtop(vaddr, mmap, &paddr) ) {
|> 				return(KL_ERROR);
|> 			}
|> 			kl_readmem(paddr, s, bp);
|>+			vaddr = tmp;
|> 			size=size - s;
|> 			vaddr=vaddr + s;
|> 			bp=bp + s;
|>diff -urN -X dontdiff lkcd_cvs_orig/lkcdutils/lkcd_config/lkcd_config.c lkcd_cvs_new/lkcdutils/lkcd_config/lkcd_config.c
|>--- lkcd_cvs_orig/lkcdutils/lkcd_config/lkcd_config.c	Fri Nov 23 17:25:37 2001
|>+++ lkcd_cvs_new/lkcdutils/lkcd_config/lkcd_config.c	Mon Nov 26 16:35:23 2001
|>@@ -242,7 +242,7 @@
|>
|> 	/* set dump compression */
|> 	if (compress_set == DUMP_TRUE) {
|>-		if ((err = ioctl(dfd, DIOSDUMPCOMPRESS, (caddr_t)&compress)) < 0) {
|>+		if ((err = ioctl(dfd, DIOSDUMPCOMPRESS, (caddr_t)compress)) < 0) {
|> 			perror("ioctl() for dump compression failed");
|> 			close(dfd);
|> 			return (err);
|>@@ -251,7 +251,7 @@
|>
|> 	/* set dump flags */
|> 	if (flags_set == DUMP_TRUE) {
|>-		if ((err = ioctl(dfd, DIOSDUMPFLAGS, (caddr_t)&flags)) < 0) {
|>+		if ((err = ioctl(dfd, DIOSDUMPFLAGS, (caddr_t)flags)) < 0) {
|> 			perror("ioctl() for dump flags failed");
|> 			close(dfd);
|> 			return (err);
|>@@ -260,7 +260,7 @@
|>
|> 	/* set dump level */
|> 	if (level_set == DUMP_TRUE) {
|>-		if ((err = ioctl(dfd, DIOSDUMPLEVEL, (caddr_t)&level)) < 0) {
|>+		if ((err = ioctl(dfd, DIOSDUMPLEVEL, (caddr_t)level)) < 0) {
|> 			perror("ioctl() for dump level failed");
|> 			close(dfd);
|> 			return (err);
|>
|>_______________________________________________
|>Lkcd-general mailing list
|>Lkcd-general@lists.sourceforge.net
|>https://lists.sourceforge.net/lists/listinfo/lkcd-general
|>



From bharata@in.ibm.com Thu Nov 29 01:56:20 2001
Received: from e31.co.us.ibm.com ([32.97.110.129] helo=e31.bld.us.ibm.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 169NvX-00048n-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 29 Nov 2001 01:56:19 -0800
Received: from westrelay03.boulder.ibm.com (westrelay03.boulder.ibm.com [9.99.140.24])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id EAA53562;
	Thu, 29 Nov 2001 04:53:14 -0500
Received: from bharata.in.ibm.com (bharata.in.ibm.com [9.186.133.24])
	by westrelay03.boulder.ibm.com (8.11.1m3/NCO v5.01) with ESMTP id fAT9tkR66976;
	Thu, 29 Nov 2001 02:55:47 -0700
Received: (from bharata@localhost)
	by bharata.in.ibm.com (8.11.2/8.11.2) id fAT9ofI05133;
	Thu, 29 Nov 2001 15:20:41 +0530
From: Bharata B Rao <bharata@in.ibm.com>
To: "Matt D. Robinson" <yakker@aparity.com>
Cc: vamsi@in.ibm.com, lkcd@oss.sgi.com, lkcd-general@lists.sourceforge.net,
        Suparna <bsuparna@in.ibm.com>, subodh@in.ibm.com
Subject: Re: [lkcd-general] [PATCH]capturing registers/stack on all processors
Message-ID: <20011129152041.A4870@in.ibm.com>
Reply-To: bharata@in.ibm.com
References: <20011127143019.A8322@in.ibm.com> <Pine.LNX.4.30.0111290020130.29286-100000@nakedeye.aparity.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <Pine.LNX.4.30.0111290020130.29286-100000@nakedeye.aparity.com>; from yakker@aparity.com on Thu, Nov 29, 2001 at 12:38:35AM -0800
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov 29 01:57:01 2001
X-Original-Date: Thu, 29 Nov 2001 15:20:41 +0530

On Thu, Nov 29, 2001 at 12:38:35AM -0800, Matt D. Robinson wrote:
> Looks good, Vamsi.  A couple of points:
> 
> - You'll need to fix the cmds.c file to deal with the NULL argument
>   at the end of the new commands (or it will conflict with Naomi-san's
>   latest code)

Looks like NULL characters are already present at the end of the command
list both in cmds.c and i386_cmds.c

> - Why hw_irq.h?  I didn't see any CONFIG_DUMP stuff in there ...

It defines a vector DUMP_VECTOR used by dump code to send NMIs.
We need to put #ifdef CONFIG_DUMP around it.

> - I think you need to add the printk("Dump "); for the default case and
>   add CONFIG_DUMP_MODULE for the 'd' command in sysrq.c (unless there's
>   a reason not to have it?
> 

For sysrq 'd' to work with dump module, we need to define a function 
pointer for show_this_cpu_state and use it in sysrq.c. Also we should
export some symbols. Anyway this can be done.

> My only other concern, which isn't that big, is that someone will
> complain if we try to add show_this_cpu_state() into 2.5, as it
> is mostly duplicate code.
> 

show_this_cpu_state() is a combination of show_stack and show_trace.
But show_stack/show_trace depend on stack being page aligned (when
they do limit checking). But show_this_cpu_state() looks at saved
stack which is not page aligned. Hence the duplication.

Apart from these points, do you think this is in a state to go into
cvs ?

> Great job, Vamsi.  Are all of you on #lkcd?
Yes, we are present on #lkcd.

Thanks for your comments.
Regards,
Bharata.
> 
> --Matt
> 


From lkcd-general-owner@lists.sourceforge.net Thu Nov 29 01:56:31 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 169NvK-00048o-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 29 Nov 2001 01:56:06 -0800
Received: from e31.bld.us.ibm.com (e31.co.us.ibm.com [32.97.110.129])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fATAu6o08356
	for <lkcd@oss.sgi.com>; Thu, 29 Nov 2001 02:56:06 -0800
Received: from westrelay03.boulder.ibm.com (westrelay03.boulder.ibm.com [9.99.140.24])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id EAA53562;
	Thu, 29 Nov 2001 04:53:14 -0500
Received: from bharata.in.ibm.com (bharata.in.ibm.com [9.186.133.24])
	by westrelay03.boulder.ibm.com (8.11.1m3/NCO v5.01) with ESMTP id fAT9tkR66976;
	Thu, 29 Nov 2001 02:55:47 -0700
Received: (from bharata@localhost)
	by bharata.in.ibm.com (8.11.2/8.11.2) id fAT9ofI05133;
	Thu, 29 Nov 2001 15:20:41 +0530
From: Bharata B Rao <bharata@in.ibm.com>
To: "Matt D. Robinson" <yakker@aparity.com>
Cc: vamsi@in.ibm.com, lkcd@oss.sgi.com, lkcd-general@lists.sourceforge.net,
   Suparna <bsuparna@in.ibm.com>, subodh@in.ibm.com
Subject: Re: [lkcd-general] [PATCH]capturing registers/stack on all processors
Message-ID: <20011129152041.A4870@in.ibm.com>
Reply-To: bharata@in.ibm.com
References: <20011127143019.A8322@in.ibm.com> <Pine.LNX.4.30.0111290020130.29286-100000@nakedeye.aparity.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <Pine.LNX.4.30.0111290020130.29286-100000@nakedeye.aparity.com>; from yakker@aparity.com on Thu, Nov 29, 2001 at 12:38:35AM -0800
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov 29 01:57:03 2001
X-Original-Date: Thu, 29 Nov 2001 15:20:41 +0530

On Thu, Nov 29, 2001 at 12:38:35AM -0800, Matt D. Robinson wrote:
> Looks good, Vamsi.  A couple of points:
> 
> - You'll need to fix the cmds.c file to deal with the NULL argument
>   at the end of the new commands (or it will conflict with Naomi-san's
>   latest code)

Looks like NULL characters are already present at the end of the command
list both in cmds.c and i386_cmds.c

> - Why hw_irq.h?  I didn't see any CONFIG_DUMP stuff in there ...

It defines a vector DUMP_VECTOR used by dump code to send NMIs.
We need to put #ifdef CONFIG_DUMP around it.

> - I think you need to add the printk("Dump "); for the default case and
>   add CONFIG_DUMP_MODULE for the 'd' command in sysrq.c (unless there's
>   a reason not to have it?
> 

For sysrq 'd' to work with dump module, we need to define a function 
pointer for show_this_cpu_state and use it in sysrq.c. Also we should
export some symbols. Anyway this can be done.

> My only other concern, which isn't that big, is that someone will
> complain if we try to add show_this_cpu_state() into 2.5, as it
> is mostly duplicate code.
> 

show_this_cpu_state() is a combination of show_stack and show_trace.
But show_stack/show_trace depend on stack being page aligned (when
they do limit checking). But show_this_cpu_state() looks at saved
stack which is not page aligned. Hence the duplication.

Apart from these points, do you think this is in a state to go into
cvs ?

> Great job, Vamsi.  Are all of you on #lkcd?
Yes, we are present on #lkcd.

Thanks for your comments.
Regards,
Bharata.
> 
> --Matt
> 


From lkcd-general-owner@lists.sourceforge.net Thu Nov 29 02:13:18 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 169OBQ-0007Hz-00
	for <lkcd-general@lists.sourceforge.net>; Thu, 29 Nov 2001 02:12:44 -0800
Received: from e21.nc.us.ibm.com (e21.nc.us.ibm.com [32.97.136.227])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fATBCjo08940
	for <lkcd@oss.sgi.com>; Thu, 29 Nov 2001 03:12:45 -0800
Received: from southrelay03.raleigh.ibm.com (southrelay03.raleigh.ibm.com [9.37.3.210])
	by e21.nc.us.ibm.com (8.9.3/8.9.3) with ESMTP id EAA138128
	for <lkcd@oss.sgi.com>; Thu, 29 Nov 2001 04:09:35 -0600
Received: from sunixs.in.ibm.com (sunixs.in.ibm.com [9.186.133.23])
	by southrelay03.raleigh.ibm.com (8.11.1m3/NCO v5.01) with ESMTP id fATACdS136672
	for <lkcd@oss.sgi.com>; Thu, 29 Nov 2001 05:12:40 -0500
Received: (from subodh@localhost)
	by sunixs.in.ibm.com (8.11.2/8.11.2) id fATA6oA23044
	for lkcd@oss.sgi.com; Thu, 29 Nov 2001 15:36:50 +0530
From: Subodh Soni <subodh@in.ibm.com>
To: lkcd@oss.sgi.com
Message-ID: <20011129153650.C23031@in.ibm.com>
Reply-To: subodh@in.ibm.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
Subject: [lkcd-general] Subscribe to LKCD Mailing List
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Thu Nov 29 02:14:03 2001
X-Original-Date: Thu, 29 Nov 2001 15:36:50 +0530

Subscribe to LKCD mailing list.


From lkcd-general-owner@lists.sourceforge.net Fri Nov 30 00:05:17 2001
Received: from oss.sgi.com ([216.32.174.27])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 169ifc-0005PW-00
	for <lkcd-general@lists.sourceforge.net>; Fri, 30 Nov 2001 00:05:16 -0800
Received: from deliverator.sgi.com (deliverator.sgi.com [204.94.214.10])
	by oss.sgi.com (8.11.2/8.11.3) with SMTP id fAU95Ho31629
	for <lkcd@oss.sgi.com>; Fri, 30 Nov 2001 01:05:17 -0800
Received: from loco.csd.sgi.com (loco.csd.sgi.com [130.62.73.130]) by deliverator.sgi.com (980309.SGI.8.8.8-aspam-6.2/980310.SGI-aspam) via ESMTP id AAA21812
	for <lkcd@oss.sgi.com>; Fri, 30 Nov 2001 00:05:05 -0800 (PST)
	mail_from (tjm@sgi.com)
Received: from striker (mtv-vpn-hw-tjm-2.corp.sgi.com [134.15.18.147]) by loco.csd.sgi.com (980427.SGI.8.8.8/970903.SGI.AUTOCF) via SMTP id XAA15072; Thu, 29 Nov 2001 23:59:50 -0800 (PST)
Message-ID: <004a01c17977$23379980$93120f86@corp.sgi.com>
From: "Tom Morano" <tjm@sgi.com>
To: "Matt D. Robinson" <yakker@aparity.com>,
   "Suparna Bhattacharya" <bsuparna@in.ibm.com>
Cc: "Andreas_Herrmann/Germany/IBM%IBMDE" <aherrman@de.ibm.com>,
   <lkcd@oss.sgi.com>, <lkcd-general-admin@lists.sourceforge.net>
References: <Pine.LNX.4.30.0111280138160.28096-100000@nakedeye.aparity.com>
Subject: Re: [lkcd-general] dump and highmem
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Fri Nov 30 00:06:01 2001
X-Original-Date: Fri, 30 Nov 2001 00:15:12 -0800

Matt,

I;ve made a change to the lcrash/vmdump.c file such that the dp_address
value is now derived from the kl_virtop() function. I realize that this is a
live_dump and that it's kind of a hacky way to get a dump from a live
system. Nevertheless, it does bring up an important point. That is, how do
we represent memory pages in the dump when we're dealing with something like
HIGH_MEMORY?. If we try and use a high_memory virtual address, we will be in
direct conflict with our use of physical addresses everywhere else. I'm not
that familiar with the HIGH_MEMORY code, but is it even possible to access
high memory via a physical address? Perhaps we need a universal way of
referring to a physical memory address in the dump page headers. If we
consider the physical address space for all systems to be contiguous
(except, of course, when there are holes in memory), then we should be able
to reference any memory page via an offset to the start of that page (from
start of memory or address zero). This approach will work as long as lcrash
knows how to map virtual addresses to those physical page offsets. For
systems like ia64, it's a simple matter of subtracting PAGE_OFFSET. For
systems where some or all of kernel virtual memory is mapped (not directly
to physical memory), then some special handling in the kl_virtop() function
has to handle the translation. This is certainly an issue that is open to
discussion. I just wanted to throw these thoughts out there.

Tom

----- Original Message -----
From: "Matt D. Robinson" <yakker@aparity.com>
To: "Suparna Bhattacharya" <bsuparna@in.ibm.com>
Cc: "Andreas_Herrmann/Germany/IBM%IBMDE" <aherrman@de.ibm.com>;
<lkcd@oss.sgi.com>; <lkcd-general-admin@lists.sourceforge.net>; "Tom Morano"
<tjm@sgi.com>
Sent: Wednesday, November 28, 2001 1:46 AM
Subject: Re: [lkcd-general] dump and highmem


> On Wed, 28 Nov 2001, Suparna Bhattacharya wrote:
> |>I haven't looked into lcrash much, but I thought you'd need a unique
value
> |>of dp.dp_address per page (even for high mem pages) . How would using
the
> |>vaddr returned by kmap_atomic work  ?
>
> You do, and you're right.  max_mapnr already accounts for the
> highend_pfn value with CONFIG_HIGHMEM, so the mem_loc is the
> right value to use for dp_address in all cases.  I just went
> and looked at this again in the init.c/setup.c arch code.
>
> In any event, the code is back to the way it was.  Sorry for
> any additional confusion, I'll stop now before I make things
> worse. :)
>
> --Matt
>
>
> _______________________________________________
> Lkcd-general mailing list
> Lkcd-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/lkcd-general
>



From castor@3pardata.com Fri Nov 30 11:35:49 2001
Received: from dnai-216-15-110-218.cust.dnai.com ([216.15.110.218] helo=mail.3pardata.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 169tRt-0006fH-00
	for <lkcd-general@lists.sourceforge.net>; Fri, 30 Nov 2001 11:35:49 -0800
Received: from postal.3pardata.com (3pardata.com [192.168.1.19])
	by mail.3pardata.com (8.9.3+Sun/8.9.3) with ESMTP id LAA08870
	for <lkcd-general@lists.sourceforge.net>; Fri, 30 Nov 2001 11:35:43 -0800 (PST)
Received: from marais (marais.3pardata.com [192.168.1.107]) by postal.3pardata.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13)
	id XH43W9FK; Fri, 30 Nov 2001 11:35:44 -0800
From: Castor Fu <castor@3pardata.com>
X-X-Sender:  <castor@marais>
To: <lkcd-general@lists.sourceforge.net>
Message-ID: <Pine.LNX.4.33.0111301133420.31849-100000@marais>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: [lkcd-general] lkcd_config typos
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Fri Nov 30 11:36:06 2001
X-Original-Date: Fri, 30 Nov 2001 11:35:43 -0800 (PST)

There are some fairly trivial typos in lkcd_config dealing with
passing an address instead of value in ioctls.  People must have
fixed them by now, but it seems strange that they're not in the
tree.  I'm attaching the patch below:


Index: lkcd_config.c
===================================================================
RCS file: /cvsroot/lkcd/lkcdutils/lkcd_config/lkcd_config.c,v
retrieving revision 1.9
diff -u -r1.9 lkcd_config.c
--- lkcd_config.c	2001/11/19 22:02:09	1.9
+++ lkcd_config.c	2001/11/30 19:33:07
@@ -242,7 +242,7 @@

 	/* set dump compression */
 	if (compress_set == DUMP_TRUE) {
-		if ((err = ioctl(dfd, DIOSDUMPCOMPRESS, (caddr_t)&compress)) < 0) {
+		if ((err = ioctl(dfd, DIOSDUMPCOMPRESS, (caddr_t)compress)) < 0) {
 			perror("ioctl() for dump compression failed");
 			close(dfd);
 			return (err);
@@ -251,7 +251,7 @@

 	/* set dump flags */
 	if (flags_set == DUMP_TRUE) {
-		if ((err = ioctl(dfd, DIOSDUMPFLAGS, (caddr_t)&flags)) < 0) {
+		if ((err = ioctl(dfd, DIOSDUMPFLAGS, (caddr_t)flags)) < 0) {
 			perror("ioctl() for dump flags failed");
 			close(dfd);
 			return (err);
@@ -260,7 +260,7 @@

 	/* set dump level */
 	if (level_set == DUMP_TRUE) {
-		if ((err = ioctl(dfd, DIOSDUMPLEVEL, (caddr_t)&level)) < 0) {
+		if ((err = ioctl(dfd, DIOSDUMPLEVEL, (caddr_t)level)) < 0) {
 			perror("ioctl() for dump level failed");
 			close(dfd);
 			return (err);


-- 



From yakker@aparity.com Sat Dec 01 16:11:58 2001
Received: from w032.z064001165.sjc-ca.dsl.cnc.net ([64.1.165.32] helo=nakedeye.aparity.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 16AKEa-0000xS-00
	for <lkcd-general@lists.sourceforge.net>; Sat, 01 Dec 2001 16:11:52 -0800
Received: from localhost (yakker@localhost)
	by nakedeye.aparity.com (8.11.2/8.11.2) with ESMTP id fB20FpG32757
	for <lkcd-general@lists.sourceforge.net>; Sat, 1 Dec 2001 16:15:52 -0800
From: "Matt D. Robinson" <yakker@aparity.com>
To: <lkcd-general@lists.sourceforge.net>
Message-ID: <Pine.LNX.4.30.0112011547050.32725-100000@nakedeye.aparity.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: [lkcd-general] LKCD Update ...
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Sat Dec  1 16:12:09 2001
X-Original-Date: Sat, 1 Dec 2001 16:15:51 -0800 (PST)

Just an update for the rest of the developers out there:

- The IBM folks in India are going to check in their latest set of
  code; that should be the final base for 4.0.1.  This should be done
  by Monday, if all goes well.

- After that, 4.0.1 will continue to move forward as-is, but the
  2.4 kernel tree will branch off to a 2.5 directory, so that some
  of the future dump device mechanisms can go in, as well as
  (maybe) Linus accepting the kernel code base.  We'll see.

- After talking with Tom Morano this week, we think there's still
  some work to do with the highmem stuff in lcrash.  I'm going
  to work on finishing this up, but it won't be for 4.0.1.

- Is there anything else that needs to go into 4.0.1 that's
  critical?

--Matt



From bsuparna@in.ibm.com Sun Dec 02 23:55:19 2001
Received: from ausmtp02.au.ibm.com ([202.135.136.105])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 16Anwa-0007Op-00
	for <lkcd-general@lists.sourceforge.net>; Sun, 02 Dec 2001 23:55:16 -0800
Received: from f02n15e.au.ibm.com 
        by ausmtp02.au.ibm.com (IBM AP 2.0) with ESMTP id fB37nt2221534;
        Mon, 3 Dec 2001 18:49:55 +1100
Received: from d23m0062.in.ibm.com (d23m0062.in.ibm.com [9.184.199.181])
	by f02n15e.au.ibm.com (8.11.1m3/NCO v5.01) with ESMTP id fB37rVH58210;
	Mon, 3 Dec 2001 18:53:33 +1100
X-Priority: 1 (High)
Subject: Re: [lkcd-general] [Proposal] dump dedicated driver with polling method
To: Ken-ichi Matsuoka <matsuoka@css1.kbnes.nec.co.jp>
Cc: lkcd-general@lists.sourceforge.net, r-shibano@pb.jp.nec.com
X-Mailer: Lotus Notes Release 5.0.3 (Intl) 21 March 2000
Message-ID: <OF1C7936CD.9354F835-ON65256B16.002D2A87@in.ibm.com>
From: "Suparna Bhattacharya" <bsuparna@in.ibm.com>
X-MIMETrack: Serialize by Router on d23m0062/23/M/IBM(Release 5.0.8 |June 18, 2001) at
 03/12/2001 01:24:53 PM
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Sun Dec  2 23:56:01 2001
X-Original-Date: Mon, 3 Dec 2001 13:24:47 +0530

Thanks to Ken-ichi san for sharing the NEC proposal, and taking the time to
explain the reasoning behind it.

This seems like the right time to revive the discussion on "minimal system
dependence" dumping (i.e. almost "standalone" dumping) and about the dump
driver interface that we have talked about on this forum some time back. It
would be nice to agree on a common approach/interface.

A. Since it usually helps to start from the goals/end-requirements:

Reliable dumping in this connotation implies multiple things:
1. Ability to successfully generate the dump, no matter what the state of
the system and the point of triggering dump is.
2. Avoid risks to system integrity as a result of not shutting down the
system completely right when panic happens, but attempting the dump to disk
in that situation, no matter how damaged system state is.
3. Ensuring the accuracy of the dumped information in terms of reflecting
the relevant state exactly as of the instant where the problem was detected
and dump was triggered.

Goal 2 applies only to critical system crash/panic situations where we
undertake disruptive dumps (where the system must be brought down or
rebooted after dump is taken), and is considered to be a concern with
dumping via software in such situations.
[In the case of non-critical problem situations where we  take a
non-disruptive dump (where the system operation continues after the dump,
without needing to force a reboot), the corresponding requirement is
ensuring that the system continues working after dump, i.e. avoiding risks
to system availability or reliability after dumping.]

Trying to cover these goals would typically involve:
(a) Shutting down (quiescing) system activity - not just the CPU, but also
device activity which could impact system state  => towards 2 & 3 above.
(b) Ensuring minimal system dependence during dumping
     -- avoids relying on system state (undetermined or even potentially
corrupted)  => towards Goals 1 & 2
     -- avoids modifying system state (or saves it before modifying it) =>
towards Goals 3

Well, the ideal solution which addresses all these goals completely in
every possible situation, would require the dump to be taken by a separate
entity or at a different level (e.g. with special hardware support) and not
from within the same system that panic'ed. For a practical solution for
dump triggered by the OS, in the absence of such capabilities, the scope
may need to be cut down to a best effort attempt to get an accurate dump
with minimal risks of system integrity or data corruption (prioritizing the
above goals in the order 2, 1 and 3 as far as panic dumps are concerned).
We might need to clearly define what "best effort" means or the kind of
situations where the goals could be violated.

B. Now with reference to what today's lkcd implementation has :

Regarding (a):
The focus so far has been on normal CPU execution quiescing (stops
scheduling, stops other CPUs, thus also involves stopping the
flow/generation of new i/o requests), but interrupts are not disabled
because interrupts are still used for dumping using the current i/o
mechanism. Softirqs are enabled as well. Implications:
      i.  Ongoing device activity (e.g DMAs) not aborted.
      ii. Network interfaces not shutdown =>network interrupts could come
in.
      iii.Timers not disabled (may be needed by dump driver) => other
timers could execute
      iv.Interrupt handlers/Softirq routines could potentially process
previously queued requests generating traffic to the device, causing future
interrupts

(BTW, we did consider enabling only disk i/o related interrupts, and
disabling the rest, similar to 3.1 option (3) in NECs proposal, but
eventually a way to avoid interrupts altogether using a dump driver type
approach is what we were looking at too )

Regarding (b):
There is a dependence on raw i/o code, i.e. low level block driver,
interrupts etc. If dump happens to a device that was in use before dump
(e.g swap), then pending i/os may be serviced before dump. However, with a
dedicated dump device configured (instead of swap), the request queue (and
scsi cmd queue) is expected to be separate (reduces the impact of Cause 2
in NECs proposal).

C. With reference to NEC's proposal:

It certainly attempts to address  Goal 3 above. It may also help with Goals
1 & 2.
By using the basic request queue and scsi cmd queue infrastructure, rather
than requiring a separate standalone dump driver for each device, it
reduces some duplication of code in the dedicated dump driver, and instead
plugs in this new interface, bypassing some portions of the typical i/o
path and switching to dump specific path, while reusing some of the i/o
infrastructure. This is interesting.  What we need to understand is if this
approach can avoid all data dependency/sharing with the normal i/o path for
all scsi drivers, and the extent of chip specific code that is needed for
supporting this interface. Would we need to ensure a stricter design
philosphy with regard to global data usage by drivers ? (For example having
a per queue lock instead of the global io_request_lock in the 2.5 block
layer is a positive shift in this direction)

(Note: This approach is distinct from Castor Fu's suggestion related to
keeping this completely standalone w.r.t the linux kernel)

D. Now going back to the dump driver interface that we've had under
consideration:

Matt's suggestion was to extend the block device operations to include
dump. (We may consider extending this to other device types too, unless we
want to force a block interface on top of those). Along the lines of
similar interfaces used in AIX and IRIX, and possibly other platforms as
well, the following sub-parts were identified. These are described  here as
separate routines for simplicity of representation/illustration, but could
potentially be clubbed with other things or reorganized a bit. (Some of
these may reduce to NOPs in certain situations). We could even have just a
single bd_ops->dump() function do all of this, though it may turn out to be
useful to keep this split into separate functions.

dev_dump_open()  (or dev_dump_init())
     Invoked when the dump device is configured. Sets asides/allocates dump
specific resources for the device

dev_dump_start()
     Invoked by the generic dump code before starting dump i/o to the
device. Would make the device ready for dump. This may involving forcing a
reset/re-init of the device if necessary, suspending current i/o, switching
the request queues etc.  (May be a nop in case of a separate dedicated
device)

dev_dump_write()
     Used by the generic dump code to actually write dump data. Does not
wait for the write to complete, and may return an appropriate status if it
cannot queue/send out the entire data, so that the dumper can pass it again
when the earlier write is done.

dev_dump_ready()
     Used by the generic dump code to check if the device has completed the
last write and is ready for a fresh write request. The dumper may specify a
wait option with a timeout, to wait (in polling mode) until the device is
ready, or may choose to handle the wait poll itself.

dev_dump_end()
     Just the reverse of dev_dump_start(). It is called by the generic dump
code, once it has finished dumping. It might re-enable the device for
normal i/o (switch requests back).

dev_dump_close()  (or dev_dump_term())
     Counterpart of dev_dump_open(). Invoked when unconfiguring a dump
device. Releases dump specific resources that had been set aside for the
device.

[Optional] dev_dump_query()/dev_dump_status()
     Just to query status or other information about the dump device


The generic dumper code could handle the polling itself using the
dev_dump_ready() option to check for i/o completion. This provides it with
some flexibility in terms of handling watchdog timer refreshes or other
generic activities that may need to be done in the poll loop, and possibly
use multiple dump devices (which can take on i/o requests simultaneously)
at a generic level.

Ken-ichi-san, is the interface you have in mind between the dump upper
driver and chip specific drivers similar in nature to the above ? Or is it
a very different approach ? Did you  intend to address Goals 1 & 2 as well
in your design ? Would the approach work unirformly for both disruptive and
non-disruptive dumps ?

Regards
Suparna


  Suparna Bhattacharya
  Linux Technology Center
  IBM Software Lab, India
  E-mail : bsuparna@in.ibm.com
  Phone :  91-80-5044961



                                                                                                                                   
                    Ken-ichi Matsuoka                                                                                              
                    <matsuoka@css1.kbnes.nec.co.jp       To:     lkcd@oss.sgi.com, lkcd-general@lists.sourceforge.net              
                    >                                    cc:     r-shibano@pb.jp.nec.com                                           
                    Sent by:                             Subject:     [lkcd-general] [Proposal] dump dedicated driver with polling 
                    lkcd-general-admin@lists.sourc        method                                                                   
                    eforge.net                                                                                                     
                                                                                                                                   
                                                                                                                                   
                    11/29/01 11:24 AM                                                                                              
                                                                                                                                   
                                                                                                                                   




Hi, all

To dump the memory image when crashing on the badly damaged system,
we think reliable dump I/O is required at first.

But current LKCD unfortunately cannot dump the memory image when crashing,
so we cannot investigate crash dump. (The detail is below.)

To solve this issue, we propose the dump dedicated driver, which uses
polling
method, manages necessary buffers by itself.

Any comment is welcomed.


0. Table of Contents
  1. The abstract of proposal
    1.1. Purpose
    1.2. LKCD's problem
    1.3. Our proposal
    1.4. How to proceed
    1.5. Schedule

  2. The details of problems
    2.1. The case of the cause 1
    2.2. The case of the cause 2
  3. The details of proposal
    3-1. The idea to the cause 1
    3-2. The idea to the cause 2


The following is the abstract of proposal:

1. The abstract of proposal
  1.1. The purpose
       To dump the memory image when crashing on the badly damaged system,
       We aim the following:

         We advance LKCD's implementation so that LKCD can dump the memory
         image in any status of the system.

  1.2. LKCD's problem

       There are the following cases that we cannot investigate crash dump:

       [Issue]
          LKCD cannot dump the memory image when crashing, so we cannot
          investigate crash dump.

       The cause of this issue is linked to below.

       (Cause 1) LKCD doesn't disable the interruption during dump I/O,
                 so the interrupt handlers of other drivers are processed.
                 The interrupt handler modifies kernel resources during
                 dump I/O.

       (Cause 2) LKCD uses original raw I/O.
                 Raw I/O uses kernel resources, so the following happens:

                 (A) Kernel resources are modified during dump I/O.
                 (B) If the dumped device has already had I/O requests,
                     these requests are processed during dump I/O.

  1.3. Our proposal
       The followings are our ideas to implement above causes:

         (Against 1) dump driver with polling method
         (Against 2) LKCD personally allocates the resources of I/O
requests
                     and manages them by replacing the original ones.

       To do, We propose dump dedicated driver, which uses polling method,
       manages necessary buffers by itself.


  1.4. How to proceed
       We proceed the following:

       1. Driver Developer develops the dedicated driver by chip.
          To fix the interface between the dump upper driver
          and the dedicated driver by chip is required.
          At first, We make the specification draft.

       2. We provide the specification draft to LKCD-ML,
          discuss  with you, and fix it.

  1.5. Schedule
       At the end of Dec. 2001 Specification Draft
                               : We provide the specification draft to
                                 LKCD-ML.
       At the end of Apr. 2002 Specification Rev. 1
                               : Fix specification with you

2. The details of problems
  2.1. The case of the cause 1
       (Modified kernel resources in the interruption processing)

       Case1-1. Timer list are modified.
              (What happens)
                Timer routine is called by timer_bh which is the bottom
half
                handler of timer interruption.
                And timer function of the device driver is processed in
timer
                routine, so the status or control flags of device driver
are
                modified by the timer function.

              (What troubles)
                The status or control flags of device driver are modified,
so
                you cannot grasp the status of the device driver at the
time
                of software failure and cannot investigate the cause of it.

                For example:
                  When the network driver has a trouble, the status of
                  network driver is modified during dump I/O.
                  So you cannot investigate.

       Case1-2. The resources of the device driver which uses software
                interrupt handler are modified.

              (What happens)
                Software interrupt handler is called, so the resources of
                the device driver using it are modified.

              (What troubles)
                You cannot investigate the resources of the device driver
                using software interrupt handler.

                For example, when software interrupt handler for network
                receiving routine is triggered, the receive routine is
                processed, and the status of socket are modified.
                So you cannot investigate them.

  2.2. The case of the cause 2
       (Modified kernel resources in the raw I/O processing)

       2.2.1. The case of the cause 2-(A)

       Case2-1. The resources of the "struct request" are modified.

              (What happens)
                 These resources are allocated by the device driver
                 at the initial phase, and are managed by Kernel block I/O.
                 But dump I/O processes the following processes, so these
ones
                 are modified.

                   By dump I/O, the "struct request" is taken from free
                   request queue (in get_request), and this one is
                   enqueued to the request queue of the device driver
                   (in add_request). So these request queues are modified.

              (What troubles)
                 At the error of I/O routine, you cannot know what request
is
                 the current request and what the status of request is.

       Case2-2. The resources of the "struct scsi_cmnd" are modified.
              (What happens)
                 These resources are allocated by SCSI driver at the
                 initializing phase, and are managed by SCSI driver.
                 But dump I/O processes the following processes, so these
ones
                 are modified.

                   After enqueuing request by dump I/O, the "struct
scsi_cmnd"
                   is taken from free list (in scsi_allocate_device),
                   and is dispatched to low-level driver (in
scsi_dispatch_cmd).
                   So the resources of the "struct scsi_cmnd" are modified.

              (What troubles)
                  At the error of I/O routine, you cannot know what command
is
                  processed and what the status of the command is.

       2.2.2. The case of the cause 2-(B)

       Case2-3. The resources of the "struct request" and the
                "struct scsi_cmnd" are modified.
              (What happens)
                 If the dumped device has already had I/O requests, that
is,
                 if the dumped device has already enqueued some requests,
                 these ones are processed by dump I/O.

                 For example, in the case that the disk connected in H/W
raid
                 has I/O trouble, if you want to dump memory to SCSI disk,
                 the requests enqueued in SCSI disk has processed.

              (What troubles)
                 By dump I/O, the status of requests are modified, you
cannot
                 investigate the status of requests when crashing.


3. The details of proposal

We propose the following ideas in order not to happen these cases:

  3-1. The idea to the cause 1

       [Proposal] dump driver with polling method

       The background of this proposal is below.

       To implement the cause 1, the I/O method which doesn't modify
       kernel resources in interrupt processing is required.
       We have considered 5 methods:

        (1) Disable all interruption by the processor function,
            such as clear interrupt flag (cli) routine on IA-32.
        (2) Disable all interruption by the interrupt controller.
        (3) Disable the interruption except for timer interruption
            and the interruption for dump I/O.
        (4) In addition to (3), remove the timer routine
            and software interrupt handler which are independent
            of dump I/O.
        (5) Use of polling mechanism, Disable all interruption

       [The advantage and fault of each method]
        (1) You cannot disable the interruption, because kernel
            or the device driver has the routine which enables
            the interruption.
            That is, Modification of kernel resources happens.
        (2) Modification of kernel resources doesn't happen.
            But you cannot execute waiting routine of I/O
            completion and the routine of I/O timeout, because
            all interruption is disabled.
        (3) Modification of kernel resources happens, because
            timer routine and software interrupt handler are
            called.
        (4) Modification of kernel resources doesn't happen.
            To do this method, you need the following processes,
            But these processes are unreal.

            (A) Disable timer routine except for dump I/O routine
            (B) Disable software interrupt handlers except for
                timer_bh and scsi_bottom_half_handler

        (5) Above problems doesn't happen, because in this method,
            dedicated driver manage all I/O routine such as
            waiting routine of I/O completion, the routine of
            I/O timeout.

       [The result of our consideration]
        To implement the cause 1, I propose the method (5).


  3-2. The idea to the cause 2
       The following idea is the implementation against the cause 2:

       [Proposal] LKCD personally allocates the resources of I/O requests
and
                  manages them by replacing the original ones.

       The background of this proposal is below.

       To implement the cause 2, not to use kernel resources is required.

       We consider the following idea:

       [How to proceed]
        Replace the original resources linked in the device driver
        with the resources allocated by LKCD.

       [To do]
        The following processes are required to change resources:

        (1) During the configuration of dump system, allocate the resources
            which will be replaced original resources with.
        (2) The queue linked in the device driver (such as request)
            is depended on the state of chip very much, because
            not only Kernel I/O but also the device driver uses
            this queue.
            But it is difficult to grasp the state of the chip.
            So to reset device driver before dump I/O is required.
        (3) Save all resources and replace them with the resources
            allocated by LKCD.

       [Implementation of each resource]

        Case 1. The resources of the "struct request" are modified.
          (1) During the configuration of dump system (dump_open_kdev),
              allocate the resources.
          (2) Before dump I/O, reset the device driver.
          (3) Save all request queue linked in the device driver.
          (4) Replace the request queue with the allocated resources.
          (5) Kernel I/O manages these ones.

        Case 2. The resources of the "struct scsi_cmnd" are modified.
          (1) During the configuration of dump system (dump_open_kdev),
              allocate the resources.
          (2) Before dump I/O, reset the device driver.
          (3) Save all scsi_cmnd queue linked in scsi driver.
          (4) Replace the scsi_cmnd queue with the allocated resources.
          (5) SCSI driver manages these ones.

        Case 3. The resources of the "struct request" and the
                "struct scsi_cmnd" are modified.

          Case3 is implemented by the ideas for Case1, 2.


best regards,

================================================================
 Kenichi Matsuoka                  2nd Engineering Department
 matsuoka@css1.kbnes.nec.co.jp     Computers Software Division
 (k-matsuoka@pd.jp.nec.com)        2nd Operations Unit
 Tel +81 78-991-5578               NEC System Technologies, Ltd.
 FAX +81 78-992-5080
================================================================


_______________________________________________
Lkcd-general mailing list
Lkcd-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lkcd-general





From bharata@in.ibm.com Mon Dec 03 03:06:18 2001
Received: from e31.co.us.ibm.com ([32.97.110.129] helo=e31.bld.us.ibm.com)
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 16AqvR-0005YJ-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 03 Dec 2001 03:06:17 -0800
Received: from westrelay03.boulder.ibm.com (westrelay03.boulder.ibm.com [9.99.140.24])
	by e31.bld.us.ibm.com (8.9.3/8.9.3) with ESMTP id GAA94576
	for <lkcd-general@lists.sourceforge.net>; Mon, 3 Dec 2001 06:03:05 -0500
Received: from bharata.in.ibm.com (bharata.in.ibm.com [9.186.133.24])
	by westrelay03.boulder.ibm.com (8.11.1m3/NCO v5.01) with ESMTP id fB3B5lU182874
	for <lkcd-general@lists.sourceforge.net>; Mon, 3 Dec 2001 04:05:48 -0700
Received: (from bharata@localhost)
	by bharata.in.ibm.com (8.11.2/8.11.2) id fB3AxnX22345
	for lkcd-general@lists.sourceforge.net; Mon, 3 Dec 2001 16:29:49 +0530
From: Bharata B Rao <bharata@in.ibm.com>
To: lkcd-general@lists.sourceforge.net
Message-ID: <20011203162949.B22020@in.ibm.com>
Reply-To: bharata@in.ibm.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
Subject: [lkcd-general] lcrash failing to come up
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Dec  3 03:07:02 2001
X-Original-Date: Mon, 3 Dec 2001 16:29:49 +0530

Tried the latest lkcd and lcrash from sourceforge cvs and found that the
lkcdutils is not working.
When trying to run lcrash on a dump file, I get the following error:
------------------
Please wait...
        Initializing vmdump access ... Done.
        Loading system map ............................... Done.
        Initializing arch specific data ... Failed.
map.1: not found in dump file
------------------
Found that it is failing in the following function:
libklib/klib.c:kl_init_klib():kl_arch_init():_init_high_memory():kl_readmem()

I am using 2way smp m/c, i386 arch, 128MB memory.

Any ideas about the problem ?

Regards,
Bharata.
-- 
Bharata B Rao,
IBM Linux Technology Center,
IBM Software Lab, Bangalore.

Ph: 91-80-5262355 Ex: 3962
Mail: bharata@in.ibm.com


From AHERRMAN@de.ibm.com Mon Dec 03 05:35:27 2001
Received: from d12lmsgate.de.ibm.com ([195.212.91.199])
	by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian))
	id 16AtFj-00019N-00
	for <lkcd-general@lists.sourceforge.net>; Mon, 03 Dec 2001 05:35:23 -0800
Received: from d12relay01.de.ibm.com (d12relay01.de.ibm.com [9.165.215.22])
	by d12lmsgate.de.ibm.com (1.0.0) with ESMTP id OAA77942
	for <lkcd-general@lists.sourceforge.net>; Mon, 3 Dec 2001 14:34:58 +0100
Received: from d12ml033.de.ibm.com (d12ml033_cs0 [9.165.223.11])
	by d12relay01.de.ibm.com (8.11.1m3/NCO v5.01) with ESMTP id fB3DYqu36326;
	Mon, 3 Dec 2001 14:34:59 +0100
Subject: Re: [lkcd-general] lcrash failing to come up
To: bharata@linux.ibm.com
Cc: lkcd-general@lists.sourceforge.net
X-Mailer: Lotus Notes Release 5.0.4a  July 24, 2000
Message-ID: <OF3D149951.7CB20F85-ONC1256B17.0048082D@de.ibm.com>
From: "Andreas Herrmann" <AHERRMAN@de.ibm.com>
X-MIMETrack: Serialize by Router on D12ML033/12/M/IBM(Release 5.0.8 |June 18, 2001) at
 03/12/2001 14:35:03
MIME-Version: 1.0
Content-type: text/plain; charset=us-ascii
Sender: lkcd-general-admin@lists.sourceforge.net
Errors-To: lkcd-general-admin@lists.sourceforge.net
X-BeenThere: lkcd-general@lists.sourceforge.net
X-Mailman-Version: 2.0.5
Precedence: bulk
List-Help: <mailto:lkcd-general-request@lists.sourceforge.net?subject=help>
List-Post: <mailto:lkcd-general@lists.sourceforge.net>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=subscribe>
List-Id: Linux Kernel Crash Dumps Mailing List <lkcd-general.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/lkcd-general>,
	<mailto:lkcd-general-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://www.geocrawler.com/redir-sf.php3?list=lkcd-general>
Date: Mon Dec  3 05:36:02 2001
X-Original-Date: Mon, 3 Dec 2001 14:34:55 +0100

Hi Bharata,

Can you check the value of the first argument of the function call to
kl_readmem()?
The variable paddr is passed there. It should contain (value of high_memory
- PAGE_OFFSET).
high_memory you will find in System.map.

E.g.  If your System.map contains

    c02424a8 D high_memory

and PAGE_OFFSET is 0xc0000000 then a value of 0x02424a8 should be passed to
kl_readmem() in paddr.
Maybe System.map or KL_PAGE_OFFSET does not correspond to the system you
dumped?
Or maybe your dump is incomplete ... ?

BTW: As I recently observed, the usage of libklib-error-handling is wrong -
e.g. in lcrash/main.c.
In libklib error codes are defined as numeric values. To check error
conditions klib_error is compared with several error codes.
But at some positions in the code, bitwise-or and/or bitwise-and operations
are applied to errror codes, which aren't flags but usual numeric
values. THIS CAN'T WORK!
The error message:
map.1: not found in dump file
is probably caused by such a wrong error-handling-usage.

Regards,

Andreas

--
Linux for eServer Development
Tel :  +49-7031-16-4640
Notes mail :  Andreas Herrmann/GERMANY/IBM@IBMDE
email :  aherrman@de.ibm.com



|--------+---------------------------------------->
|        |          bharata@linux.ibm.com         |
|        |          Sent by:                      |
|        |          lkcd-general-admin@lists.sourc|
|        |          eforge.net                    |
|        |                                        |
|        |                                        |
|        |          12/03/01 11:59 AM             |
|        |          Please respond to bharata     |
|        |                                        |
|-----