FAQ Search Today's Posts Mark Forums Read
» Video Reviews

» Linux Archive

Linux-archive is a website aiming to archive linux email lists and to make them easily accessible for linux users/developers.


» Sponsor

» Partners

» Sponsor

Go Back   Linux Archive > Debian > Debian Kernel

 
 
LinkBack Thread Tools
 
Old 04-05-2010, 02:51 AM
"John David Anglin"
 
Default Bug#561203: threads and fork on machine with VIPT-WB cache

> Thanks a lot for the discussion.
>
> James Bottomley wrote:
> > So your theory is that the data the kernel sees doing the page copy can
> > be stale because of dirty cache lines in userspace (which is certainly
> > possible in the ordinary way)?
>
> Yes.
>
> > By design that shouldn't happen: the idea behind COW breaking is
> > that before it breaks, the page is read only ... this means that
> > processes can have clean cache copies of it, but never dirty cache
> > copies (because writes are forbidden).
>
> That must be design, I agree.
>
> To keep this condition (no dirty cache for COW page), we need to flush
> cache before ptep_set_wrprotect. That's my point.
>
> Please look at the code path:
> (kernel/fork.c)
> do_fork -> copy_process -> copy_mm -> dup_mm -> dup_mmap ->
> (mm/memory.c)
> copy_page_range -> copy_p*d_range -> copy_one_pte -> ptep_set_wrprotect
>
> The function flush_cache_dup_mm is called from dup_mmap, that's enough
> for a case of a process with single thread.
> I think that:
> We need to flush cache before ptep_set_wrprotect for a process with
> multiple threads. Other threads may change memory after a thread
> invokes do_fork and before calling ptep_set_wrprotect. Specifically,
> a process may sleep at pte_alloc function to get a page.

I agree. It is interesting that in the case of the Debian bug that
a thread of the parent process causes the COW break and thereby corrupts
its own memory. As far as I can tell, the fork'd child never writes
to the memory that causes the fault.

My testing indicates that your suggested change fixes the Debian
bug. I've attached below my latest test version. This seems to fix
the bug on both SMP and UP kernels.

However, it doesn't fix all page/cache related issues on parisc
SMP kernels that I commonly see.

My first inclination after even before reading your analysis was
to assume that copy_user_page was broken (i.e, that even if a
processor cache was dirty when the COW page was write protected,
it should be possible to do the flush before the page is copied).
However, this didn't seem to work... Possibly, there are issues
with aliased addresses.

I note that sparc flushes the entire cache and purges the entire
tlb in kmap_atomic/kunmap_atomic for highmem. Although the breakage
that I see is not limited to PA8800/PA8900, I'm not convinced
that we maintain coherency that is required for these processors
in copy_user_page when we have multiple threads.

As a side note, kmap_atomic/kunmap_atomic seem to lack calls to
pagefault_disable()/pagefault_enable() on PA8800.

Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)

diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
index a27d2e2..b140d5c 100644
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -14,6 +14,7 @@
#include <linux/bitops.h>
#include <asm/processor.h>
#include <asm/cache.h>
+extern void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn);

/*
* kern_addr_valid(ADDR) tests if ADDR is pointing to valid kernel
@@ -456,17 +457,22 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
return old_pte;
}

-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
+static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
{
#ifdef CONFIG_SMP
unsigned long new, old;
+#endif
+ pte_t old_pte = *ptep;
+
+ if (atomic_read(&mm->mm_users) > 1)
+ flush_cache_page(vma, addr, pte_pfn(old_pte));

+#ifdef CONFIG_SMP
do {
old = pte_val(*ptep);
new = pte_val(pte_wrprotect(__pte (old)));
} while (cmpxchg((unsigned long *) ptep, old, new) != old);
#else
- pte_t old_pte = *ptep;
set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
#endif
}
diff --git a/mm/memory.c b/mm/memory.c
index 09e4b1b..21c2916 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -616,7 +616,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
* in the parent and the child
*/
if (is_cow_mapping(vm_flags)) {
- ptep_set_wrprotect(src_mm, addr, src_pte);
+ ptep_set_wrprotect(vma, src_mm, addr, src_pte);
pte = pte_wrprotect(pte);
}




--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: 20100405025154.DE1D55160@hiauly1.hia.nrc.ca">http://lists.debian.org/20100405025154.DE1D55160@hiauly1.hia.nrc.ca
 
Old 04-05-2010, 02:58 AM
"John David Anglin"
 
Default Bug#561203: threads and fork on machine with VIPT-WB cache

> > > By design that shouldn't happen: the idea behind COW breaking is
> > > that before it breaks, the page is read only ... this means that
> > > processes can have clean cache copies of it, but never dirty cache
> > > copies (because writes are forbidden).
> >
> > That must be design, I agree.
> >
> > To keep this condition (no dirty cache for COW page), we need to flush
> > cache before ptep_set_wrprotect. That's my point.

Is it possible that a sleep/reschedule could cause the cache to become
dirty again before it is write protected?

Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)



--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: 20100405025840.B79A54EF4@hiauly1.hia.nrc.ca">http://lists.debian.org/20100405025840.B79A54EF4@hiauly1.hia.nrc.ca
 
Old 04-05-2010, 04:18 PM
James Bottomley
 
Default Bug#561203: threads and fork on machine with VIPT-WB cache

On Sun, 2010-04-04 at 22:51 -0400, John David Anglin wrote:
> > Thanks a lot for the discussion.
> >
> > James Bottomley wrote:
> > > So your theory is that the data the kernel sees doing the page copy can
> > > be stale because of dirty cache lines in userspace (which is certainly
> > > possible in the ordinary way)?
> >
> > Yes.
> >
> > > By design that shouldn't happen: the idea behind COW breaking is
> > > that before it breaks, the page is read only ... this means that
> > > processes can have clean cache copies of it, but never dirty cache
> > > copies (because writes are forbidden).
> >
> > That must be design, I agree.
> >
> > To keep this condition (no dirty cache for COW page), we need to flush
> > cache before ptep_set_wrprotect. That's my point.
> >
> > Please look at the code path:
> > (kernel/fork.c)
> > do_fork -> copy_process -> copy_mm -> dup_mm -> dup_mmap ->
> > (mm/memory.c)
> > copy_page_range -> copy_p*d_range -> copy_one_pte -> ptep_set_wrprotect
> >
> > The function flush_cache_dup_mm is called from dup_mmap, that's enough
> > for a case of a process with single thread.
> > I think that:
> > We need to flush cache before ptep_set_wrprotect for a process with
> > multiple threads. Other threads may change memory after a thread
> > invokes do_fork and before calling ptep_set_wrprotect. Specifically,
> > a process may sleep at pte_alloc function to get a page.
>
> I agree. It is interesting that in the case of the Debian bug that
> a thread of the parent process causes the COW break and thereby corrupts
> its own memory. As far as I can tell, the fork'd child never writes
> to the memory that causes the fault.
>
> My testing indicates that your suggested change fixes the Debian
> bug. I've attached below my latest test version. This seems to fix
> the bug on both SMP and UP kernels.
>
> However, it doesn't fix all page/cache related issues on parisc
> SMP kernels that I commonly see.
>
> My first inclination after even before reading your analysis was
> to assume that copy_user_page was broken (i.e, that even if a
> processor cache was dirty when the COW page was write protected,
> it should be possible to do the flush before the page is copied).
> However, this didn't seem to work... Possibly, there are issues
> with aliased addresses.
>
> I note that sparc flushes the entire cache and purges the entire
> tlb in kmap_atomic/kunmap_atomic for highmem. Although the breakage
> that I see is not limited to PA8800/PA8900, I'm not convinced
> that we maintain coherency that is required for these processors
> in copy_user_page when we have multiple threads.
>
> As a side note, kmap_atomic/kunmap_atomic seem to lack calls to
> pagefault_disable()/pagefault_enable() on PA8800.
>
> Dave
> --
> J. David Anglin dave.anglin@nrc-cnrc.gc.ca
> National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
>
> diff --git a/arch/parisc/include/asm/pgtable.h b/arch/parisc/include/asm/pgtable.h
> index a27d2e2..b140d5c 100644
> --- a/arch/parisc/include/asm/pgtable.h
> +++ b/arch/parisc/include/asm/pgtable.h
> @@ -14,6 +14,7 @@
> #include <linux/bitops.h>
> #include <asm/processor.h>
> #include <asm/cache.h>
> +extern void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn);
>
> /*
> * kern_addr_valid(ADDR) tests if ADDR is pointing to valid kernel
> @@ -456,17 +457,22 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
> return old_pte;
> }
>
> -static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> +static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> {
> #ifdef CONFIG_SMP
> unsigned long new, old;
> +#endif
> + pte_t old_pte = *ptep;
> +
> + if (atomic_read(&mm->mm_users) > 1)

Just to verify there's nothing this is hiding, can you make this

if (pte_dirty(old_pte))

and reverify? The if clause should only trip on the case where the
parent has dirtied the line between flush and now.

> + flush_cache_page(vma, addr, pte_pfn(old_pte));
>
> +#ifdef CONFIG_SMP
> do {
> old = pte_val(*ptep);
> new = pte_val(pte_wrprotect(__pte (old)));
> } while (cmpxchg((unsigned long *) ptep, old, new) != old);
> #else
> - pte_t old_pte = *ptep;
> set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
> #endif
> }
> diff --git a/mm/memory.c b/mm/memory.c
> index 09e4b1b..21c2916 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -616,7 +616,7 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
> * in the parent and the child
> */
> if (is_cow_mapping(vm_flags)) {
> - ptep_set_wrprotect(src_mm, addr, src_pte);
> + ptep_set_wrprotect(vma, src_mm, addr, src_pte);

So this is going to be a hard sell because of the arch churn. There are,
however, three ways to do it with the original signature.

1. implement copy_user_highpage ... this allows us to copy through
the child's page cache (which is coherent with the parent's
before the cow) and thus pick up any cache changes without a
flush
2. use the mm identically to flush_user_cache_page_noncurrent. The
only reason that needs the vma is for the icache check ... but
that shouldn't happen here (if the parent is actually doing a
self modifying exec region, it needs to manage coherency
itself).
3. Flush in kmap ... this is something that's been worrying me
since the flamewars over kmap for pio.

James





--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: 1270484318.2790.38.camel@mulgrave.site">http://lists.debian.org/1270484318.2790.38.camel@mulgrave.site
 
Old 04-06-2010, 04:57 AM
NIIBE Yutaka
 
Default Bug#561203: threads and fork on machine with VIPT-WB cache

John David Anglin wrote:
> It is interesting that in the case of the Debian bug that
> a thread of the parent process causes the COW break and thereby corrupts
> its own memory. As far as I can tell, the fork'd child never writes
> to the memory that causes the fault.

Thanks for writing and testing a patch.

The case of #561203 is second scenario. I think that this case is
relevant to VIVT-WB machine too (provided kernel does copy by kernel
address).

James Bottomley wrote:
> So this is going to be a hard sell because of the arch churn. There are,
> however, three ways to do it with the original signature.

Currently, I think that signature change would be inevitable for
ptep_set_wrprotect.

> 1. implement copy_user_highpage ... this allows us to copy through
> the child's page cache (which is coherent with the parent's
> before the cow) and thus pick up any cache changes without a
> flush

Let me think about this way.

Well, this would improve both cases of the first scenario of mine and
the second scenario.

But... I think that even if we would have copy_user_highpage which
does copy by user address, we need to flush at ptep_set_wrprotect. I
think that we need to keep the condition: no dirty cache for COW page.

Think about third scenario of threads and fork:

(1) In process A, there are multiple threads, and a thread A-1 invokes
fork. We have process B, with a different space identifier color.

(2) Another thread A-2 in process A runs while A-1 copies memory by
dup_mmap. A-2 writes to the address <x> in a page. Let's call
this page <oldpage>.

(3) We have dirty cache for <x> by A-2 at the time of
ptep_set_wrprotect of thread A-1. Suppose that we don't flush
here.

(4) A-1 finishes copy, and sleeps.

(5) Child process B is waken up and sees old value at <x> in <oldpage>,
through different cache line. B sleeps.

(6) A-2 is waken up. A-2 touches the memory again, breaks COW. A-2
copies data on <oldpage> to <newpage>. OK, <newpage> is
consistent with copy_user_highpage by user address.

Note that during this copy, the cache line of <x> by A-2 is
flushed out to <oldpage>. It invokes another memory fault and COW
break. (I think that this memory fault is unhealthy.)
Then, new value goes to <x> on <oldpage> (when it's physically
tagged cache).

A-2 sleeps.

(7) Child process B is waken up. When it accesses at <x>, it sees new
value suddenly.


If we flush cache to <oldpage> at ptep_set_wrprotect, this couldn't
occur.


* * *


I know that we should not do "threads and fork". It is difficult to
define clean semantics. Because another thread may touch memory while
a thread which does memory copy for fork, the memory what the child
process will see may be inconsistent. For the child, a page might be
new, while another page might be old.

For VIVT-WB cache machine, I am considering a possibility for the
child process to have inconsistent memory even within a single page
(when we have no flush at ptep_set_wrprotect).

It will be needed for me to talk to linux-arch soon or later.
--



--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: 4BBABF23.1030405@fsij.org">http://lists.debian.org/4BBABF23.1030405@fsij.org
 
Old 04-06-2010, 01:37 PM
James Bottomley
 
Default Bug#561203: threads and fork on machine with VIPT-WB cache

On Tue, 2010-04-06 at 13:57 +0900, NIIBE Yutaka wrote:
> John David Anglin wrote:
> > It is interesting that in the case of the Debian bug that
> > a thread of the parent process causes the COW break and thereby corrupts
> > its own memory. As far as I can tell, the fork'd child never writes
> > to the memory that causes the fault.
>
> Thanks for writing and testing a patch.
>
> The case of #561203 is second scenario. I think that this case is
> relevant to VIVT-WB machine too (provided kernel does copy by kernel
> address).
>
> James Bottomley wrote:
> > So this is going to be a hard sell because of the arch churn. There are,
> > however, three ways to do it with the original signature.
>
> Currently, I think that signature change would be inevitable for
> ptep_set_wrprotect.

Well we can't do it by claiming several architectures are wrong in their
implementation. We might do it by claiming to need vma knowledge ...
however, even if you want the flush, as I said, you don't need to change
the signature.

> > 1. implement copy_user_highpage ... this allows us to copy through
> > the child's page cache (which is coherent with the parent's
> > before the cow) and thus pick up any cache changes without a
> > flush
>
> Let me think about this way.
>
> Well, this would improve both cases of the first scenario of mine and
> the second scenario.
>
> But... I think that even if we would have copy_user_highpage which
> does copy by user address, we need to flush at ptep_set_wrprotect. I
> think that we need to keep the condition: no dirty cache for COW page.
>
> Think about third scenario of threads and fork:
>
> (1) In process A, there are multiple threads, and a thread A-1 invokes
> fork. We have process B, with a different space identifier color.

I don't understand what you mean by space colour ... there's cache
colour which refers to the line in the cache to which the the physical
memory maps. The way PA is set up, space ID doesn't factor into cache
colour.

> (2) Another thread A-2 in process A runs while A-1 copies memory by
> dup_mmap. A-2 writes to the address <x> in a page. Let's call
> this page <oldpage>.
>
> (3) We have dirty cache for <x> by A-2 at the time of
> ptep_set_wrprotect of thread A-1. Suppose that we don't flush
> here.
>
> (4) A-1 finishes copy, and sleeps.
>
> (5) Child process B is waken up and sees old value at <x> in <oldpage>,
> through different cache line. B sleeps.

This isn't possible. at this point, A and B have the same virtual
address and mapping for <oldpage> this means they are the same cache
colour, so they both see the cached value.

James

> (6) A-2 is waken up. A-2 touches the memory again, breaks COW. A-2
> copies data on <oldpage> to <newpage>. OK, <newpage> is
> consistent with copy_user_highpage by user address.
>
> Note that during this copy, the cache line of <x> by A-2 is
> flushed out to <oldpage>. It invokes another memory fault and COW
> break. (I think that this memory fault is unhealthy.)
> Then, new value goes to <x> on <oldpage> (when it's physically
> tagged cache).
>
> A-2 sleeps.
>
> (7) Child process B is waken up. When it accesses at <x>, it sees new
> value suddenly.
>
>
> If we flush cache to <oldpage> at ptep_set_wrprotect, this couldn't
> occur.
>
>
> * * *
>
>
> I know that we should not do "threads and fork". It is difficult to
> define clean semantics. Because another thread may touch memory while
> a thread which does memory copy for fork, the memory what the child
> process will see may be inconsistent. For the child, a page might be
> new, while another page might be old.
>
> For VIVT-WB cache machine, I am considering a possibility for the
> child process to have inconsistent memory even within a single page
> (when we have no flush at ptep_set_wrprotect).
>
> It will be needed for me to talk to linux-arch soon or later.





--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: 1270561069.4493.29.camel@mulgrave.site">http://lists.debian.org/1270561069.4493.29.camel@mulgrave.site
 
Old 04-06-2010, 01:44 PM
James Bottomley
 
Default Bug#561203: threads and fork on machine with VIPT-WB cache

On Tue, 2010-04-06 at 08:37 -0500, James Bottomley wrote:
> > (5) Child process B is waken up and sees old value at <x> in
> <oldpage>,
> > through different cache line. B sleeps.
>
> This isn't possible. at this point, A and B have the same virtual
> address and mapping for <oldpage> this means they are the same cache
> colour, so they both see the cached value.

Perhaps to add more detail to this. In spite of what the arch manual
says (it says the congruence stride is 16MB), the congruence stride on
all manufactured parisc processors is 4MB. This means that any virtual
addresses, regardless of space id, that are equal modulo 4MB have the
same cache colour.

James





--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: 1270561481.4493.40.camel@mulgrave.site">http://lists.debian.org/1270561481.4493.40.camel@mulgrave.site
 
Old 04-08-2010, 09:11 PM
Helge Deller
 
Default Bug#561203: threads and fork on machine with VIPT-WB cache

On 04/02/2010 09:35 PM, John David Anglin wrote:
> On Fri, 02 Apr 2010, NIIBE Yutaka wrote:
>
>> NIIBE Yutaka wrote:
>>> To have same semantics as other archs, I think that VIPT-WB cache
>>> machine should have cache flush at ptep_set_wrprotect, so that memory
>>> of the page has up-to-date data. Yes, it will be huge performance
>>> impact for fork. But I don't find any good solution other than this
>>> yet.
>>
>> I think we could do something like (only for VIPT-WB cache machine):
>>
>> - static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long
>> address, pte_t *ptep)
>>
>> + static inline void ptep_set_wrprotect(struct vm_area_struct *vma, struct
>> mm_struct *mm, unsigned long addr, pte_t *ptep)
>> {
>> pte_t old_pte = *ptep;
>> + if (atomic_read(&mm->mm_users) > 1)
>> + flush_cache_page(vma, addr, pte_pfn(old_pte));
>> set_pte_at(mm, addr, ptep, pte_wrprotect(old_pte));
>> }
>
> I tested the hack below on two machines currently running 2.6.33.2
> UP kernels. The change seems to fix Debian #561203 (minifail bug)!
> Thus, I definitely think you are on the right track. I'll continue
> to test.
>
> I suspect the same issue is present for SMP kernels.

Hi Dave,

I tested your patch today on one of my machines with plain kernel 2.6.33 (32bit, SMP, B2000 I think).
Sadly I still did see the minifail bug.

Are you sure, that the patch fixed this bug for you?

Helge

do_page_fault() pid=21470 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=7986 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=19952 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=13549 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=21862 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=4615 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=17336 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=21986 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=2157 command='minifail3' type=15 address=0x000000dc
do_page_fault() pid=23886 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=2681 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=3229 command='minifail3' type=15 address=0x000000ec
do_page_fault() pid=26095 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=20722 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=19912 command='minifail3' type=15 address=0x000000ec
...
pagealloc: memory corruption
7db0c780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
7db0c790: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
7db0c7a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
7db0c7b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Backtrace:
[<1011ec14>] show_stack+0x18/0x28
[<10117ba0>] dump_stack+0x1c/0x2c
[<101c6594>] kernel_map_pages+0x2a0/0x2b8
[<1019e6c8>] get_page_from_freelist+0x3d4/0x614
[<1019ea3c>] __alloc_pages_nodemask+0x134/0x610
[<101b1d20>] do_wp_page+0x268/0xac0
[<101b3b34>] handle_mm_fault+0x4d4/0x7c4
[<1011d854>] do_page_fault+0x1f8/0x2fc
[<1011f450>] handle_interruption+0xec/0x730
[<10103078>] intr_check_sig+0x0/0x34
...
do_page_fault() pid=13414 command='minifail3' type=15 address=0x000000dc
do_page_fault() pid=22776 command='minifail3' type=15 address=0x00000000
do_page_fault() pid=26290 command='minifail3' type=15 address=0x000000ec
do_page_fault() pid=1399 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=16130 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=26401 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=3383 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=3400 command='minifail3' type=15 address=0x00000004
do_page_fault() pid=18659 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=3730 command='minifail3' type=6 address=0x00000003
do_page_fault() pid=28828 command='minifail3' type=6 address=0x00000003



--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: 4BBE468D.5060609@gmx.de">http://lists.debian.org/4BBE468D.5060609@gmx.de
 
Old 04-08-2010, 10:44 PM
"John David Anglin"
 
Default Bug#561203: threads and fork on machine with VIPT-WB cache

> On Thu, 08 Apr 2010, Helge Deller wrote:

> > I tested your patch today on one of my machines with plain kernel 2.6.33 (32bit, SMP, B2000 I think).
> > Sadly I still did see the minifail bug.
> >
> > Are you sure, that the patch fixed this bug for you?
>
> Seemed to, but I have a bunch of other changes installed. Possibly,
> the change to cacheflush.h is important. It affects all PA8000.

I also think the change suggested by James

+ if (pte_dirty(old_pte))

is important for SMP. With the patch set that I sent, my rp3440 and
gsyprf11 seem reasonably stable running 2.6.33.2 SMP. I doubt all
problems are solved but things are a lot better than before.

Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)



--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: 20100408224446.96F294FA3@hiauly1.hia.nrc.ca">http://lists.debian.org/20100408224446.96F294FA3@hiauly1.hia.nrc.ca
 
Old 06-02-2010, 03:33 PM
Modestas Vainius
 
Default Bug#561203: threads and fork on machine with VIPT-WB cache

Hello,

this bug [1] is back to the "very common" department with eglibc 2.11 (libc6-
dev_2.11.1-1) builds. Majority of KDE applications are failing to build on
hppa again. Is there really nothing what could be done to fix it?

1. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=561203
2.
https://buildd.debian.org/fetch.cgi?pkg=kde4libs;ver=4%3A4.4.4-1;arch=hppa;stamp=1275467025
3.
https://buildd.debian.org/fetch.cgi?pkg=basket;ver=1.80-1;arch=hppa;stamp=1275483241

--
Modestas Vainius <modestas@vainius.eu>
 
Old 06-02-2010, 05:16 PM
John David Anglin
 
Default Bug#561203: threads and fork on machine with VIPT-WB cache

On Wed, 02 Jun 2010, Modestas Vainius wrote:

> Hello,
>
> this bug [1] is back to the "very common" department with eglibc 2.11 (libc6-
> dev_2.11.1-1) builds. Majority of KDE applications are failing to build on
> hppa again. Is there really nothing what could be done to fix it?

I will just say it is very tricky. I think a fix is possible (arm and mips
had similar cache problems) but the victim replacement present in PA8800/PA8900
caches makes the problem especially difficult for hardware using these
processors.

I have spent the last few months testing various alternatives and have
now done hundreds of kernel builds. I did post some experimental patches
that fix the problem on UP kernels. However, the problem is not resolved
for SMP kernels.

The minifail test is a good one to demonstrate the problem. Indeed,
a very similar test was given in the thread below:
http://readlist.com/lists/vger.kernel.org/linux-kernel/54/270861.html

This thread also discusses the PA8800 problem:
http://readlist.com/lists/vger.kernel.org/linux-kernel/54/271417.html

I currently surmise that we have a problem with the cache victim
replacement, although the cause isn't clear. I did find recently
that the cache prefetch in copy_user_page_asm extends to the line
beyond the end of the page, but fixing this doesn't resolve the problem.

I am still experimenting with using equivalent aliasing. It does
help to flush in ptep_set_wrprotect.

Dave
--
J. David Anglin dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6602)



--
To UNSUBSCRIBE, email to debian-kernel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: 20100602171600.GA5566@hiauly1.hia.nrc.ca">http://lists.debian.org/20100602171600.GA5566@hiauly1.hia.nrc.ca
 

Thread Tools




All times are GMT. The time now is 11:37 PM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.
Copyright 2007 - 2008, www.linux-archive.org