Bob Montgomery 02-01-2011 11:12 PM

RFC: string search for crash
 
Here is a patch to add string search to crash-5.1.1.

It requires my previous patch for the parse_line routine.

It searches for the specified strings, and for half or more of the
string appearing at the start and end of search blocks (usually pages)
in case the string spans a page boundary.

It is currently invoked with the -c option to search, as in:

crash-5.1.1str> search -k -e 0xffffc90000000000 -c "getty[3895]"
ffff880123b54810: :46 getty[3895]: /dev/ttyS1: No such file or dir

or

crash-5.1.1str> search -k -c -e 0xffffc90000000000 "getty[3895]"
ffff880123b54810: :46 getty[3895]: /dev/ttyS1: No such file or dir

It reports the found string in 48 chars of context (except
at the end of pages), and it reports aligned addresses, so the found
string doesn't always appear at the beginning of the context (as in the
example above).

It could optionally use strncasecmp to do case-insensitive searches.

I simplified it :-) by combining the main and tail searches into one
loop and added a 10-15% performance degradation somewhere.

Here it searches the dump for bugs and danger :-)

crash-5.1.1str> search -k -c -e 0xffffc90000000000 "bugs" "danger"
ffff8801254c0ff8: ug:Debug
ffff880125cd4ff8: ebian/bu
ffff880125ec8870: efb: danger danger! Oopsen imminent!..<6>Mode (%
ffff880125ec8878: ger danger! Oopsen imminent!..<6>Mode (%dx%d) la
ffff880125eee518: ofb: danger danger! Oopsen imminent!..<3>neofb:
ffff880125eee520: ger danger! Oopsen imminent!..<3>neofb: neo2200
ffff880125f72000: ger_event.......................................
ffff880125fc3560: ice bugs (default = 0, 128kB max transfer = 0x1,

The first two hits come because half or more of "bugs" occurred at the
end of a page. The next to the last hit is found because the last
half of "danger" appears at the beginning of a page.
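
A minimal sketch of the boundary check described above. It assumes that
"half or more" rounds up for odd lengths and that only proper fragments
(shorter than the whole string) are of interest at a block edge. The names
half_len, tail_fragment and head_fragment are hypothetical and are not
taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Smallest fragment length that counts as "half or more" of a string. */
static size_t half_len(size_t len)
{
    return (len + 1) / 2;
}

/*
 * Length of the longest proper prefix of str, at least half of it,
 * that sits exactly at the end of the block; 0 if there is none.
 */
size_t tail_fragment(const char *block, size_t blocklen, const char *str)
{
    size_t len = strlen(str);
    size_t n;

    assert(len > 0);
    for (n = len - 1; n >= half_len(len); n--) {
        if (n <= blocklen && memcmp(block + blocklen - n, str, n) == 0)
            return n;
    }
    return 0;
}

/*
 * Length of the longest proper suffix of str, at least half of it,
 * that sits exactly at the start of the block; 0 if there is none.
 */
size_t head_fragment(const char *block, size_t blocklen, const char *str)
{
    size_t len = strlen(str);
    size_t n;

    assert(len > 0);
    for (n = len - 1; n >= half_len(len); n--) {
        if (n <= blocklen && memcmp(block, str + len - n, n) == 0)
            return n;
    }
    return 0;
}
```

With the output above, tail_fragment() on a block ending in "ug:Debug"
reports 3 bytes of "bugs", and head_fragment() on a block starting with
"ger_event" reports 3 bytes of "danger".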

Bob Montgomery




--
Crash-utility mailing list
Crash-utility@redhat.com
https://www.redhat.com/mailman/listinfo/crash-utility

Dave Anderson 02-02-2011 01:49 PM

Hi Bob,

A couple things...

First, this is a really nifty feature...

Second, I appreciate that you created a new string_search() function
instead of attempting to merge it into the existing search() function.
I'm doing the same thing for implementing the "-p" functionality because
physical addresses may require 64-bit start/end addresses on 32-bit machines.
However, I did have to change cmd_search() to use ulonglong's for start
and end. Then trying to shoe-horn physical memory searching into
the existing virtual-memory-presuming search() command was way too ugly
to consider.

Third, given that you're searching for a specific string, why not show
the actual byte-aligned address as the starting point, and then just
display the string contents until the next non-ASCII character, or some
other delimiter like a 48 byte limit or whatever?
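
A minimal sketch of that display rule, assuming "non-ASCII" means
"not printable according to isprint()" and taking the 48-byte limit from the
suggestion above. The function name printable_run and its parameters are
hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MAX_SHOWN 48   /* display limit, as suggested in the text */

/*
 * Copy the printable run that starts at buf[off] into out, stopping at
 * the first non-printable byte, the end of the buffer, MAX_SHOWN bytes,
 * or when out is full. Returns the number of bytes copied.
 */
size_t printable_run(const unsigned char *buf, size_t buflen, size_t off,
                     char *out, size_t outsz)
{
    size_t n = 0;

    assert(outsz > 0);
    while (off + n < buflen && n < MAX_SHOWN && n + 1 < outsz &&
           isprint(buf[off + n])) {
        out[n] = (char)buf[off + n];
        n++;
    }
    out[n] = '\0';
    return n;
}
```

The caller would print the byte-aligned address of the hit followed by the
returned text, so no pre-string context is shown.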

I understand that the page-crossing issue is a PITA, but with respect
to a user searching memory for strings, the page-crossing issue is
seemingly irrelevant. In other words, why this:

> ffff880125f72000: ger_event.......................................

instead of displaying something like this:

ffff880125f71ffd: danger_event

So I guess I'm just wondering why the "dan" at the end of the page
cannot be displayed? It just doesn't seem user-friendly to force
the user to understand the half-of-the-string-at-a-page-boundary
business. Since you do recognize that string exists and crosses
a page boundary, shouldn't you be able to display the first part?

Lastly, I'm still planning to add the remaining "search" command
updates for the -KVM flags and the "missing" x86_64 segment bug,
and so I may need you to re-work this patch to fit into an updated
version of search(). I've been delayed getting crash-5.1.2 out
the door because of all of the recent patch postings, and it may
be worth waiting to get this piece in until 5.1.3 -- is that OK
with you?

Thanks,
Dave


Dave Anderson 02-02-2011 07:28 PM

----- Original Message -----
> On Wed, 2011-02-02 at 14:49 +0000, Dave Anderson wrote:
> >
> > ----- Original Message -----
> > > Here is a patch to add string search to crash-5.1.1.
> > >
> > > It requires my previous patch for the parse_line routine.
> > >
> > > It searches for the specified strings, and for half or more of the
> > > string appearing at the start and end of search blocks (usually pages)
> > > in case the string spans a page boundary.
> > >
>
> > First, this is a really nifty feature...
>
> Thanks.
>
> > Second, I appreciate that you created a new string_search() function
> > instead of attempting to merge it into the existing search()
> > function.
>
> Way too ugly otherwise.
>
> >
> > Third, given that you're searching for a specific string, why not show
> > the actual byte-aligned address as the starting point, and then just
> > display the string contents until the next non-ASCII character, or some
> > other delimiter like a 48 byte limit or whatever?
>
> That's actually the easier thing to do and just a preference decision.
> I like looking at aligned addresses for using with subsequent rd
> commands. And it's also sometimes useful to see some pre-string context.
> But rounding to the previous long aligned address is really random with
> respect to pre-string context. There's probably a better way to provide
> that.

Actually, rounding back down to the previous long-aligned address might
be reasonable.

>
> > I understand that the page-crossing issue is a PITA, but with respect
> > to a user searching memory for strings, the page-crossing issue is
> > seemingly irrelevant. In other words, why this:
> >
> > > ffff880125f72000: ger_event.......................................
> >
> > instead of displaying something like this:
> >
> > ffff880125f71ffd: danger_event
> >
> > So I guess I'm just wondering why the "dan" at the end of the page
> > cannot be displayed? It just doesn't seem user-friendly to force
> > the user to understand the half-of-the-string-at-a-page-boundary
> > business. Since you do recognize that string exists and crosses
> > a page boundary, shouldn't you be able to display the first part?
>
> I started working on my search stuff directly on dumpfiles, which meant
> I was searching through a collection of physical pages. When searching
> physical pages, there is no reason to believe that the end of page N has
> anything logically to do with the beginning of page N+1. So looking
> across to the next page for the continuation of a string didn't seem
> strictly correct or useful. (This was never even a consideration when
> searching for a long or int; they don't cross page boundaries.)

Right, certainly it doesn't make sense with physical pages, and to a
somewhat lesser extent, with unity-mapped pages. But with user virtual,
the mapped kernel virtual region (x86_64 and ia64), and vmalloc addresses,
strings would cross page boundaries. So I wonder whether it's worth the
effort to do something like I suggest since you always know what kind of
memory is being searched? I know, I know, it's easy for *me* to say...

But that leads to another issue. As I mentioned before, the -p
implementation I've got uses its own function, so to marry that
with the string search capability kind of mucks with your patch.
I'm wondering whether your string search function could be called
from both the virtual and physical search functions near the bottom
where the page is searched? Just something to think about.

>
> If I'm searching by virtual address, then N+1 does follow N, but I don't
> know that at the time I'm working on page N. The current search
> strategy is "choose a page, load the page, search the page". So if I
> find a promising first part of a string at the end of a page, the search
> loop would need a memory to carry my position in the string(s) across to
> the next page, if I can determine that the next page follows
> contiguously in the current search type. Not impossible, but not
> trivial, I think. Possibly easier to pre-read the following page to the
> length of the longest search string onto the end of an extended page
> buffer and then search for the full string to the end of that extended
> page buffer before loading the next page (and part of the one after
> that). But you still have to deal with the possibility that the "next"
> page isn't contiguous to the one you're working on.

Right, perhaps the "last-page-contents-and-its-address" could always be saved?
And you could always attempt to read the next page if you see the beginning
of a search string.
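
A minimal sketch of the "extended page buffer" idea quoted above, under
stated assumptions: a tiny 16-byte page for illustration, a caller that
passes NULL when the following page is unreadable or not contiguous, and a
hypothetical function name. It is not the patch's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 16   /* tiny page size, for illustration only */
#define MAX_STR 8    /* longest supported search string */

/*
 * Search one page for str, letting a match that starts in this page run
 * into the first bytes of the following page. Returns the offset of the
 * first match within page, or -1 if there is none.
 */
long search_page_extended(const char *page, const char *next, const char *str)
{
    char buf[PAGE_SZ + MAX_STR - 1];
    size_t len = strlen(str);
    size_t buflen = PAGE_SZ;
    size_t off;

    assert(len > 0 && len <= MAX_STR);
    memcpy(buf, page, PAGE_SZ);
    if (next != NULL) {
        /* len - 1 extra bytes suffice: a match must start in page. */
        memcpy(buf + PAGE_SZ, next, len - 1);
        buflen += len - 1;
    }
    for (off = 0; off + len <= buflen; off++) {
        if (memcmp(buf + off, str, len) == 0)
            return (long)off;
    }
    return -1;
}
```

Because only len - 1 bytes of the following page are appended, a string
that lies wholly in the next page is not reported twice.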

>
> When you called out the example above, a first-of-page hit on
> "ger_event" when searching for the string "danger", my first thought was
> that you had found a bug. The "dan" at the end of the previous page
> should also have met the criteria for half of the string "danger", and
> showed up as an end-of-page hit. But (whew!), here's what's in memory
> there:
>
> crash-5.1.1str> rd ffff880125f71ff0 8
> ffff880125f71ff0: 000000000ed7ab73 676972745f64656c s.......led_trig
> ffff880125f72000: 6e6576655f726567 0000000000000074 ger_event.......
> ffff880125f72010: 0000000000000000 0000000000000000 ................
> ffff880125f72020: 0000000000000000 0000000000000000 ................
>
> Which serves as a reminder that the "part of a string" match is just
> there to suggest possibilities that might turn out to be false
> positives.
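
The rd output above can be checked by hand: on a little-endian machine the
least significant byte of each 64-bit word comes first in memory, so
676972745f64656c reads as "led_trig" and the full string is
"led_trigger_event", not "danger_event". A minimal sketch of that decoding,
assuming little-endian byte order and a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Expand 64-bit words, as printed by "rd" on a little-endian machine,
 * into the ASCII column: least significant byte first, '.' for any byte
 * outside the printable ASCII range. out needs nwords * 8 + 1 bytes.
 */
void words_to_ascii(const uint64_t *words, size_t nwords, char *out)
{
    size_t i, b;

    assert(out != NULL);
    for (i = 0; i < nwords; i++) {
        for (b = 0; b < 8; b++) {
            unsigned char c = (unsigned char)(words[i] >> (8 * b));
            out[i * 8 + b] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
        }
    }
    out[nwords * 8] = '\0';
}
```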
>
> > Lastly, I'm still planning to add the remaining "search" command
> > updates for the -KVM flags and the "missing" x86_64 segment bug,
> > and so I may need you to re-work this patch to fit into an updated
> > version of search(). I've been delayed getting crash-5.1.2 out
> > the door because of all of the recent patch postings, and it may
> > be worth waiting to get this piece in until 5.1.3 -- is that OK
> > with you?
>
> No rush.

OK good.

Thanks,
Dave



Dave Anderson 02-10-2011 07:29 PM

Hi Bob,

So 5.1.2 is finally available for you to merge in your new string
search capability...

As it turns out, the changes made for fixing and correctly implementing
the search command have made things look quite a bit different. As I
mentioned before, there are now two search functions, search_virtual()
and search_physical(). The major change was to the next_kpage() function,
where I wanted to avoid hacking in a bunch of "if (machine_type(XXX))"
sections. So taking your original suggestion, I created a helper
machine-dependent function called via machdep->get_kvaddr_ranges().
It returns an array of kernel virtual address ranges and their type,
sorted by their starting kernel virtual address. At the bottom of
cmd_search(), search_virtual() gets called repeatedly for each type
of kernel virtual addresses if it's appropriate for the user's request.
There are 5 possible kernel virtual range types:

KVADDR_UNITY_MAP
KVADDR_VMALLOC
KVADDR_VMEMMAP (possible on x86_64, ia64, ppc64 and s390x)
KVADDR_START_MAP (ia64 and x86_64 only)
KVADDR_MODULES (x86_64 only if modules are not in vmalloc() vmlist)

x86_64, ia64, ppc64 and s390x implement their own machine-dependent
xxxx_get_kvaddr_ranges() function; all of the other architectures
use generic_get_kvaddr_ranges(), which returns KVADDR_UNITY_MAP and
KVADDR_VMALLOC ranges.
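
A minimal sketch of what "an array of ranges and their type, sorted by
starting address" could look like. The struct, enum and function names here
are hypothetical stand-ins and are not the definitions used in crash; only
the five range types come from the list above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real definitions live in crash's sources. */
enum kv_type { KV_UNITY_MAP, KV_VMALLOC, KV_VMEMMAP, KV_START_MAP, KV_MODULES };

struct kv_range {
    unsigned long long start;
    unsigned long long end;
    enum kv_type type;
};

/* Order two ranges by starting address without overflowing an int. */
static int cmp_range(const void *a, const void *b)
{
    const struct kv_range *ra = a, *rb = b;

    return (ra->start > rb->start) - (ra->start < rb->start);
}

/* Sort the ranges by ascending start address. */
void sort_ranges(struct kv_range *r, size_t n)
{
    assert(r != NULL || n == 0);
    qsort(r, n, sizeof(*r), cmp_range);
}

/* First range of the requested type, or NULL if the machine has none. */
const struct kv_range *range_of_type(const struct kv_range *r, size_t n,
                                     enum kv_type type)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (r[i].type == type)
            return &r[i];
    return NULL;
}
```

A caller handling a vmalloc-only request could skip every entry whose type
is KV_UNITY_MAP and search the rest in address order.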

If you "set debug 1" before the search, you can see what the
range values and types are, and then each call to search_virtual()
displays the start/end address, does the search, and then
indicates how many pages were checked, and of those, how many
were actually read -- like here from a 2.6.37 dumpfile:

crash> set debug 1
debug: 1
crash> search -k deadbeef
kvaddr ranges:
[0] ffff880000000000 ffff880040000000 KVADDR_UNITY_MAP
[1] ffffc90000000000 ffffffffa051c000 KVADDR_VMALLOC
[2] ffffea0000000000 ffffea0000e00000 KVADDR_VMEMMAP
[3] ffffffff80000000 ffffffff8202f000 KVADDR_START_MAP
search_virtual: start: ffff880000000000 end: ffff880040000000
ffff880008703438: deadbeef
ffff880016ffd730: deadbeef
ffff88002cf9e580: deadbeef
ffff8800320f9170: deadbeef
ffff8800327dc580: deadbeef
ffff880032ded170: deadbeef
ffff880033191170: deadbeef
ffff88003f504580: deadbeef
search_virtual: read 262128 (99%) of 262144 pages checked in 36 seconds
search_virtual: start: ffffc90000000000 end: ffffffffa051c000
search_virtual: read 3995 (5%) of 70924 pages checked in 2 seconds
search_virtual: start: ffffea0000000000 end: ffffea0000e00000
search_virtual: read 3584 (100%) of 3584 pages checked in 0 seconds
search_virtual: start: ffffffff80000000 end: ffffffff8202f000
search_virtual: read 8223 (99%) of 8239 pages checked in 3 seconds
crash>

I did implement the -K and -V flags, but unlike your original problem,
the vmalloc range search is several orders of magnitude shorter
in time consumed. The problem was that next_vmlist_vaddr() was
repeatedly calling dump_vmlist() to get the vm_struct list; I
changed it to call it only once for a dumpfile, and once-per-command
invocation on a live system. That being the case, the -V flag
is far more useful than -K, because it will not check unity-mapped
memory -- say for example, if you want to check the virtual mem_map
range for pages pointing to a particular address_space mapping.

In any case, getting back to your string search option. I'd
prefer it if you could continue to make the function separate,
and because there are now search_physical() and search_virtual()
functions, your string-search function would have to be called
from the bottom of their respective page-cycling loops. In other
words, please don't try to merge the string-search code with
the existing value-search code. I'm pretty sure you wouldn't
want to anyway.

Then, getting back to the original discussion as to how to handle
strings that cross page boundaries. I think that now I am of the
opinion that you shouldn't do anything special for physical or
unity-mapped memory. I understand that the pages won't necessarily
(and in fact probably will not) be contiguous in actual use -- but
they very well might be. Consider huge-pages, or page allocations
that are order-1 or larger. And even if they are not contiguous,
what's the harm of displaying a cross-page string match if you
find one? I would think that would be something of interest
rather than something to avoid.

That being the case, you wouldn't have to do any kind of special
handling for the various page types. If a string crosses a page
boundary -- then show it dammit! ;-)

It would seem that if even the very last character of a page
matches the beginning of a string, you could save that information,
(or the whole page), and upon checking the next *contiguous* page,
you could initially check for a cross-page string. Of course the
"multiple-search-argument" capability makes it a little bit trickier,
but it still seems doable.
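
A minimal sketch of the "save that information" approach for a single search
string, under stated assumptions: a tiny 16-byte page for illustration, a
saved tail of MAX_STR - 1 bytes, and hypothetical names (struct carry,
save_tail, seam_match). A return value of 0 means "no cross-page match",
which assumes address 0 is never a valid hit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 16   /* tiny page size, for illustration only */
#define MAX_STR 8    /* longest supported search string */

struct carry {
    unsigned long long end_addr;   /* address just past the saved page */
    char tail[MAX_STR - 1];        /* last MAX_STR - 1 bytes of that page */
    int valid;                     /* nonzero once a page has been saved */
};

/* Remember the tail of the page that was just searched. */
void save_tail(struct carry *c, unsigned long long addr, const char *page)
{
    memcpy(c->tail, page + PAGE_SZ - (MAX_STR - 1), MAX_STR - 1);
    c->end_addr = addr + PAGE_SZ;
    c->valid = 1;
}

/*
 * Before searching the page at addr, look for str straddling the seam
 * with the saved page. Returns the start address of the match, or 0 if
 * the pages are not contiguous or nothing straddles the boundary.
 */
unsigned long long seam_match(const struct carry *c, unsigned long long addr,
                              const char *page, const char *str)
{
    char seam[2 * (MAX_STR - 1)];
    size_t len = strlen(str);
    size_t k;

    assert(len > 0 && len <= MAX_STR);
    if (!c->valid || c->end_addr != addr)
        return 0;
    memcpy(seam, c->tail, MAX_STR - 1);
    memcpy(seam + MAX_STR - 1, page, MAX_STR - 1);
    /* k is the number of bytes of str that lie in the previous page. */
    for (k = len - 1; k >= 1; k--) {
        if (memcmp(seam + (MAX_STR - 1) - k, str, len) == 0)
            return addr - k;
    }
    return 0;
}
```

With several search arguments, the same check would be repeated per string
against one saved tail sized for the longest of them.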

And as far as the round-down to a word boundary issue, now I'm not
convinced that that is really necessary. You had mentioned concerns
about using "rd", but you can pass it any address and it will start
displaying at that address:

crash> rd ffffffff80274020 5
ffffffff80274020: 65762078756e694c 2e32206e6f697372 Linux version 2.
ffffffff80274030: 3832312d38312e36 6f6d2820356c652e 6.18-128.el5 (mo
ffffffff80274040: 40646c6975626b63 ckbuild@
crash> rd ffffffff80274021 5
ffffffff80274021: 7265762078756e69 362e32206e6f6973 inux version 2.6
ffffffff80274031: 2e3832312d38312e 636f6d2820356c65 .18-128.el5 (moc
ffffffff80274041: 6840646c6975626b kbuild@h
crash>

You probably had some other concern?

Anyway, have at it. And thanks again very much for pursuing this to begin
with, as it really was time to overhaul the search command. The days
of a simple unity-mapped region followed by a small vmalloc range are
over.

I look forward to your next patch...

Thanks,
Dave
