11-04-2009, 10:08 PM
Mario Limonciello

-nvidia upgrade issues

Hi Bryce:

I've got a couple of comments I'll echo here

On Wed, Nov 4, 2009 at 16:26, Bryce Harrington <bryce@canonical.com> wrote:

> I've been looking into some problems people have been reporting
> upgrading to Karmic with -nvidia installed.
>
> One thing I've noticed is aside from whatever issue is occurring with
> nvidia, there are bugs elsewhere which are compounding the problems and
> leading to some poor user experiences. A common scenario occurs if for
> whatever reason the -nvidia kernel module fails to build in DKMS:

It would be very good to try to get a sampling of why the kernel modules are failing to build. Can you try to get people to collect the failed make.log's in these scenarios?


> 438398 - If DKMS fails to build the kernel module, the package upgrade
> does not kick out. It shows package upgrade as successful. So this
> leads directly to...

So the problem with declaring the package as failed if the DKMS build failed is that it may actually pass or fail depending on how far along into the updates you are.

Say you are updating to a new linux-headers with a new ABI at the same time as installing the NVIDIA package.

Well, if the NVIDIA package is processed first, the headers aren't yet installed, so the package will fail during postinst, but as soon as the headers are loaded, the kernel postinst runs and the modules get successfully built.

Perhaps a potential solution is to look into whether the headers are yet available for this kernel, and if they aren't, don't let the DKMS build failure cause the postinst to fail, but in any other scenario let the postinst fail.

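A minimal sketch of that decision, written in Python for illustration only. It assumes that the presence of a /lib/modules/<release>/build directory is a reasonable proxy for "headers are available"; the function names `headers_available` and `postinst_exit_code` are hypothetical and are not part of DKMS or of any nvidia package script.

```python
import os


def headers_available(kernel_release, root="/"):
    """Return True if a kernel build tree appears to exist for this release.

    Assumption: the headers package provides /lib/modules/<release>/build
    (normally a symlink). os.path.isdir follows symlinks, so a dangling
    link is treated the same as missing headers.
    """
    build_dir = os.path.join(root, "lib", "modules", kernel_release, "build")
    return os.path.isdir(build_dir)


def postinst_exit_code(build_succeeded, headers_present):
    """Decide whether a failed module build should fail the postinst."""
    if build_succeeded:
        return 0
    if not headers_present:
        # The headers may simply not be unpacked yet in this run, so the
        # build can still happen later from the kernel postinst.
        return 0
    # Headers were there and the build still failed: report it.
    return 1
```

The `root` parameter exists only so the check can be exercised against a scratch directory instead of the live system.
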
> 451305 - Jockey misses that the driver failed to build, and so is not
> letting users know about the potential problem. It goes ahead and
> updates xorg.conf as if the driver was there. X tries to obey the
> configuration settings, but of course they won't work, so it exits on
> startup with an error message. *Normally* bulletproof-X would kick in
> at this point, display the error to the user, and give them some tools
> to diagnose and/or debug the situation. Unfortunately...

I see three potential improvements to Jockey for this scenario.

1. Have Jockey be able to work in an interactive frontend. If the package install behavior is modified to query if the headers are yet available, then you can more nicely present this information to the user.
2. Have Jockey check for the headers for the current kernel before even starting to install the packages.
3. Before modifying the xorg.conf, do the equivalent of a modinfo nvidia to determine if the nvidia kernel module is indeed created. Show a warning/error otherwise.

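The third item could be approximated along the following lines. This is only a sketch: it looks for a `<name>.ko` file under the kernel's module tree rather than calling modinfo itself, so it does not consult the module index the way modinfo does. The names `find_module` and `check_before_xorg_update` are hypothetical and are not Jockey APIs.

```python
import os


def find_module(module_name, kernel_release, root="/"):
    """Return the path of <module_name>.ko for this kernel, or None."""
    tree = os.path.join(root, "lib", "modules", kernel_release)
    wanted = module_name + ".ko"
    # os.walk yields nothing if the tree does not exist, so a missing
    # kernel directory simply results in None.
    for dirpath, _dirnames, filenames in os.walk(tree):
        if wanted in filenames:
            return os.path.join(dirpath, wanted)
    return None


def check_before_xorg_update(module_name, kernel_release, root="/"):
    """Return (ok, message); xorg.conf should only be touched when ok is True."""
    path = find_module(module_name, kernel_release, root)
    if path is None:
        return (False, "kernel module %s was not built for %s; "
                       "leaving xorg.conf unchanged"
                       % (module_name, kernel_release))
    return (True, "found %s" % path)
```

A caller would show the message as a warning and skip the xorg.conf change whenever the first element is False.
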


> 474806 - The new gdm no longer supports the FailsafeXServer option, so
> the diagnostic session no longer can be triggered to come up. Instead,
> gdm tries several times, then gives up, but then...
>
> 441638 - The gdm upstart job notices gdm has failed and so restarts it.
> X of course continues to fail, gdm tries a few times and continues to
> fail, repeat ad infinitum, and the user is just left looking at a
> flashing screen. Ick.

This has been a pet peeve of mine too, so I'm glad to see a karmic-updates milestoned task on this bug.

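The loop described in 441638 is what happens when a supervisor respawns a failing service without any bound. A toy sketch of a bounded alternative follows, assuming a hypothetical `start_service` callable that returns True on success. It is not a description of any actual gdm or upstart change; upstart job files do have a `respawn limit` stanza that serves a similar purpose.

```python
def supervise(start_service, max_respawns=5):
    """Start a service, respawning it at most max_respawns times.

    Returns ("running", attempts) on success, or ("gave-up", attempts)
    once the initial start plus all allowed respawns have failed.
    """
    attempts = 0
    while attempts <= max_respawns:
        attempts += 1
        if start_service():
            return ("running", attempts)
    # Stop here instead of looping forever, so that something else
    # (for example a diagnostic session) can take over.
    return ("gave-up", attempts)
```
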




> The above appears to be a pretty common scenario that we're getting a
> rash of bug reports about. It's hard to be certain because many of the
> bug reports are only including information about the failed boot, not on
> the failed build. So I'm not sure if it is just one reason why the
> build fails, or several. However if we can solve the above bugs it
> should give much better visibility into things.
>
> Btw, workaround for anyone experiencing this issue is to purge your
> nvidia (and fglrx) packages, remove /etc/X11/xorg.conf, and reinstall
> nvidia (or fglrx). It appears that in most of the bug reports this gets
> the system functioning again. Doing a full reinstall of Ubuntu rather
> than an upgrade also appears to work around the issues.
>
> Bryce
>
> --
> ubuntu-devel mailing list
> ubuntu-devel@lists.ubuntu.com
> Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel

--
Mario Limonciello
superm1@gmail.com
Sent from Manchester, New Hampshire, United States
 
11-04-2009, 11:31 PM
Bryce Harrington

-nvidia upgrade issues

On Wed, Nov 04, 2009 at 05:08:17PM -0600, Mario Limonciello wrote:
> Hi Bryce:
>
> I've got a couple of comments I'll echo here
>
> On Wed, Nov 4, 2009 at 16:26, Bryce Harrington <bryce@canonical.com> wrote:
>
> > I've been looking into some problems people have been reporting
> > upgrading to Karmic with -nvidia installed.
> >
> > One thing I've noticed is aside from whatever issue is occurring with
> > nvidia, there are bugs elsewhere which are compounding the problems and
> > leading to some poor user experiences. A common scenario occurs if for
> > whatever reason the -nvidia kernel module fails to build in DKMS:
> >
>
> It would be very good to try to get a sampling of why the kernel modules are
> failing to build. Can you try to get people to collect the failed
> make.log's in these scenarios?

Sure. Maybe we also need to update ubuntu-bug to automatically attach
those files for nvidia bugs. Let me know if there are any other files
that are useful for debugging -nvidia or dkms issues and I'll add them
in as well.
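A rough sketch of what gathering those logs might look like. It assumes DKMS keeps each build log at <dkms root>/<module>/<version>/build/make.log; the /var/lib/dkms/nvidia/185.18.36 directory is mentioned elsewhere in this thread, but the build/make.log suffix is an assumption here. The function names are hypothetical and this is not an existing ubuntu-bug or apport hook.

```python
import glob
import os


def find_dkms_make_logs(module, dkms_root="/var/lib/dkms"):
    """Return the make.log paths for every version of a DKMS module."""
    pattern = os.path.join(dkms_root, module, "*", "build", "make.log")
    return sorted(glob.glob(pattern))


def collect_logs(module, dkms_root="/var/lib/dkms", max_chars=100000):
    """Map each make.log path to the tail of its contents."""
    logs = {}
    for path in find_dkms_make_logs(module, dkms_root):
        # Build output may contain odd bytes; replace rather than fail.
        with open(path, "r", errors="replace") as handle:
            text = handle.read()
        # Keep only the end of the log, where the compiler error usually is.
        logs[path] = text[-max_chars:]
    return logs
```

A bug-reporting hook could then attach each entry of the returned dictionary to the report.
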

> > 438398 - If DKMS fails to build the kernel module, the package upgrade
> > does not kick out. It shows package upgrade as successful. So this
> > leads directly to...
> >
> >
> So the problem with declaring the package as failed if the DKMS build failed
> is that it may actually pass or fail depending on how far along into the
> updates you are.
>
> Say you are updating to a new linux-headers with a new ABI at the same time
> as installing the NVIDIA package.
>
> Well if the NVIDIA package is processed first, the headers aren't yet
> installed, so the package will fail during postinst, but as soon as the
> headers are loaded, the kernel postinst runs and the modules get
> successfully built.
> Perhaps a potential solution is to look into whether the headers are yet
> available for this kernel, and if they aren't, don't let the DKMS build failure
> cause the postinst to fail, but in any other scenario let the postinst fail.

*Nod* Also there is at least one bug report where it is claimed dkms was
doing its thing while gdm was starting up, and since the module hadn't
finished building, boom. Bug 453365.

> > 451305 - Jockey misses that the driver failed to build, and so is not
> > letting users know about the potential problem. It goes ahead and
> > updates xorg.conf as if the driver was there. X tries to obey the
> > configuration settings, but of course they won't work, so it exits on
> > startup with an error message. *Normally* bulletproof-X would kick in
> > at this point, display the error to the user, and give them some tools
> > to diagnose and/or debug the situation. Unfortunately...
> >
>
> I see three potential improvements to Jockey for this scenario.
>
>
> 1. Have Jockey be able to work in an interactive frontend. If the
> package install behavior is modified to query if the headers are yet
> available, then you can more nicely present this information to the user
> 2. Have Jockey check for the headers for the current kernel before even
> starting to install the packages.
> 3. Before modifying the xorg.conf, do the equivalent of a modinfo nvidia
> to determine if the nvidia kernel module is indeed created. Show a
> warning/error otherwise.

Agreed. All three would be worth having; I would prioritize #3, since it
sounds like it would require the least code change and may be the quickest
to get an SRU on. Pitti, opinions?

> > 474806 - The new gdm no longer supports the FailsafeXServer option, so
> > the diagnostic session no longer can be triggered to come up. Instead,
> > gdm tries several times, then gives up, but then...
> >
> > 441638 - The gdm upstart job notices gdm has failed and so restarts it.
> > X of course continues to fail, gdm tries a few times and continues to
> > fail, repeat ad infinitum, and the user is just left looking at a
> > flashing screen. Ick.
> >
> This has been a pet peeve of mine too, so I'm glad to see a karmic-updates
> milestoned task on this bug.

Yeah, I brought this one up pre-release, but I guess it was too late to
solve it before the release was finalized. I hope we can see an SRU on it soon.

Bryce

 
11-04-2009, 11:50 PM
Bryce Harrington

-nvidia upgrade issues

On Wed, Nov 04, 2009 at 05:08:17PM -0600, Mario Limonciello wrote:
> Hi Bryce:
>
> I've got a couple of comments I'll echo here
>
> On Wed, Nov 4, 2009 at 16:26, Bryce Harrington <bryce@canonical.com> wrote:
>
> > I've been looking into some problems people have been reporting
> > upgrading to Karmic with -nvidia installed.
> >
> > One thing I've noticed is aside from whatever issue is occurring with
> > nvidia, there are bugs elsewhere which are compounding the problems and
> > leading to some poor user experiences. A common scenario occurs if for
> > whatever reason the -nvidia kernel module fails to build in DKMS:
> >
>
> It would be very good to try to get a sampling of why the kernel modules are
> failing to build. Can you try to get people to collect the failed
> make.log's in these scenarios?

Bug 450238 adds some further information as to what might be going
wrong:

"""
Adding Module to DKMS build system
+ dkms add -m nvidia -v

Error! Invalid number of arguments passed.
Usage: add -m <module> -v <module-version>

The reason for this is that the script uses a variable $CVERSION that
is never defined. Adding it manually works:

dkms add -m nvidia -v 185.18.36

Creating symlink /var/lib/dkms/nvidia/185.18.36/source ->
/usr/src/nvidia-185.18.36

DKMS: add Completed.
"""
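The failure quoted above comes from an empty version string reaching the command line. A small illustrative guard follows, assuming a hypothetical helper `dkms_add_command` that builds the argument list before anything is executed; only the `dkms add -m <module> -v <module-version>` form shown in the bug report is used.

```python
def dkms_add_command(module, version):
    """Build the 'dkms add' argument list, refusing an empty version."""
    if version is None or not version.strip():
        # An unset variable would otherwise yield 'dkms add -m nvidia -v',
        # which dkms rejects with "Invalid number of arguments passed."
        raise ValueError(
            "refusing to run 'dkms add' for %s without a version" % module)
    return ["dkms", "add", "-m", module, "-v", version.strip()]
```

Failing early with a clear message makes the missing variable visible at the point where it is used rather than in dkms's usage output.
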

Bryce

 
11-05-2009, 12:05 AM
Bryce Harrington

-nvidia upgrade issues

On Wed, Nov 04, 2009 at 04:50:57PM -0800, Bryce Harrington wrote:
> On Wed, Nov 04, 2009 at 05:08:17PM -0600, Mario Limonciello wrote:
> > Hi Bryce:
> >
> > I've got a couple of comments I'll echo here
> >
> > On Wed, Nov 4, 2009 at 16:26, Bryce Harrington <bryce@canonical.com> wrote:
> >
> > > I've been looking into some problems people have been reporting
> > > upgrading to Karmic with -nvidia installed.
> > >
> > > One thing I've noticed is aside from whatever issue is occurring with
> > > nvidia, there are bugs elsewhere which are compounding the problems and
> > > leading to some poor user experiences. A common scenario occurs if for
> > > whatever reason the -nvidia kernel module fails to build in DKMS:
> > >
> >
> > It would be very good to try to get a sampling of why the kernel modules are
> > failing to build. Can you try to get people to collect the failed
> > make.log's in these scenarios?
>
> Bug 450238 adds some further information as to what might be going
> wrong:

Sorry, false alarm: this simply appears to be a duplicate of a bug you
already fixed in the package a few weeks ago.

> """
> Adding Module to DKMS build system
> + dkms add -m nvidia -v
>
> Error! Invalid number of arguments passed.
> Usage: add -m <module> -v <module-version>
>
> The reason for this is that the script uses a variable $CVERSION that
> is never defined. Adding it manually works:
>
> dkms add -m nvidia -v 185.18.36
>
> Creating symlink /var/lib/dkms/nvidia/185.18.36/source ->
> /usr/src/nvidia-185.18.36
>
> DKMS: add Completed.
> """
>
> Bryce
>
 
11-05-2009, 02:51 AM
Steve Langasek

-nvidia upgrade issues

On Wed, Nov 04, 2009 at 02:26:56PM -0800, Bryce Harrington wrote:
> 474806 - The new gdm no longer supports the FailsafeXServer option, so
> the diagnostic session no longer can be triggered to come up. Instead,
> gdm tries several times, then gives up, but then...

> 441638 - The gdm upstart job notices gdm has failed and so restarts it.
> X of course continues to fail, gdm tries a few times and continues to
> fail, repeat ad infinitum, and the user is just left looking at a
> flashing screen. Ick.

Fixes for both of these are now in the karmic-proposed queue.

Cheers,
--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slangasek@ubuntu.com                                     vorlon@debian.org
 
11-05-2009, 06:44 AM
Bryce Harrington

-nvidia upgrade issues

On Wed, Nov 04, 2009 at 05:08:17PM -0600, Mario Limonciello wrote:
> Hi Bryce:
>
> I've got a couple of comments I'll echo here
>
> On Wed, Nov 4, 2009 at 16:26, Bryce Harrington <bryce@canonical.com> wrote:
>
> > I've been looking into some problems people have been reporting
> > upgrading to Karmic with -nvidia installed.
> >
> > One thing I've noticed is aside from whatever issue is occurring with
> > nvidia, there are bugs elsewhere which are compounding the problems and
> > leading to some poor user experiences. A common scenario occurs if for
> > whatever reason the -nvidia kernel module fails to build in DKMS:
> >
>
> It would be very good to try to get a sampling of why the kernel modules are
> failing to build. Can you try to get people to collect the failed
> make.log's in these scenarios?

Btw, in poking around in dkms.conf I noticed this:

PACKAGE_VERSION="185.18.31"

Shouldn't that be 185.18.36? Or am I misunderstanding the purpose of
this file?

Bryce


 
11-05-2009, 08:43 AM
Martin Pitt

-nvidia upgrade issues

Mario Limonciello [2009-11-04 17:08 -0600]:
> I see three potential improvements to Jockey for this scenario.
>
> 1. Have Jockey be able to work in an interactive frontend. If the
> package install behavior is modified to query if the headers are yet
> available, then you can more nicely present this information to the user

What do you mean by "interactive frontend"? Do you mean for debconf?
I'm afraid that requires a rewrite of Jockey, since it's currently
frontend <-> dbus <-> backend <-> python-apt, so the backend doesn't
have X access. I'm afraid this isn't SRUable.

> 2. Have Jockey check for the headers for the current kernel before even
> starting to install the packages.
> 3. Before modifying the xorg.conf, do the equivalent of a modinfo nvidia
> to determine if the nvidia kernel module is indeed created. Show a
> warning/error otherwise.

Those make a lot of sense. I'll see to fixing those ASAP and SRU them.

Thanks,

Martin

--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
 
11-05-2009, 10:44 AM
Amit Kucheria

-nvidia upgrade issues

On Thu, Nov 5, 2009 at 12:26 AM, Bryce Harrington <bryce@canonical.com> wrote:
> I've been looking into some problems people have been reporting
> upgrading to Karmic with -nvidia installed.
>

<snip>

I filed bug 456240 regarding the dkms package failing to compile. My
video still works and I haven't had a chance to track the bug down.

I've attached a log to the bug. I'd be happy to provide more
information if necessary.

/Amit

 
11-05-2009, 11:04 AM
Steve Langasek

-nvidia upgrade issues

On Wed, Nov 04, 2009 at 05:08:17PM -0600, Mario Limonciello wrote:
> > 438398 - If DKMS fails to build the kernel module, the package upgrade
> > does not kick out. It shows package upgrade as successful. So this
> > leads directly to...

> So the problem with declaring the package as failed if the DKMS build failed
> is that it may actually pass or fail depending on how far along into the
> updates you are.

> Say you are updating to a new linux-headers with a new ABI at the same time
> as installing the NVIDIA package.

> Well if the NVIDIA package is processed first, the headers aren't yet
> installed, so the package will fail during postinst, but as soon as the
> headers are loaded, the kernel postinst runs and the modules get
> successfully built.
> Perhaps a potential solution is to look into whether the headers are yet
> available for this kernel, and if they aren't, don't let the DKMS build failure
> cause the postinst to fail, but in any other scenario let the postinst fail.

I wonder if a dpkg trigger wouldn't help here for lucid (not for SRU): each
dkms module package registers its interest in an appropriate file pattern,
and at the end of the corresponding dpkg run the trigger fires to try to do
the module compilation? This would have the advantage that dpkg would then
have information about exactly which dkms packages failed to build, but I
haven't thought this through completely to be sure it's worth doing and
doesn't have any major design pitfalls.
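
The idea can be modelled as a deferred queue: packages note their interest during the run, and the builds happen once at the end, with failures reported per module. The sketch below is a plain Python simulation under that assumption, with a hypothetical `TriggerQueue` class. It does not use dpkg itself, where triggers are declared through a package's triggers control file and delivered to the postinst.

```python
class TriggerQueue:
    """Toy model of trigger-style deferral for DKMS module builds."""

    def __init__(self):
        self._pending = []  # module names, in activation order, no repeats

    def activate(self, module):
        """Record that a module needs building once the run is finished."""
        if module not in self._pending:
            self._pending.append(module)

    def run_pending(self, build):
        """Call build(module) for each pending module; return the failures."""
        failed = []
        for module in self._pending:
            if not build(module):
                failed.append(module)
        self._pending = []
        return failed
```

Because the builds run after everything else has been unpacked, the ordering problem between the headers and the driver package does not arise in this model, and the returned list says exactly which modules failed.
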

--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slangasek@ubuntu.com                                     vorlon@debian.org
 
11-05-2009, 12:29 PM
Mario Limonciello

-nvidia upgrade issues

Hi Bryce:

On Thu, Nov 5, 2009 at 01:44, Bryce Harrington <bryce@canonical.com> wrote:

> Btw, in poking around in dkms.conf I noticed this:
>
> PACKAGE_VERSION="185.18.31"
>
> Shouldn't that be 185.18.36? Or am I misunderstanding the purpose of
> this file?


If you have encountered a scenario where that doesn't reflect the version installed, that's a bug for sure in the nvidia driver package you are working with, and I am certain there will be future problems on such a system.
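
A small sketch of how such a mismatch could be detected automatically. It assumes the source tree is named <module>-<version>, as in the /usr/src/nvidia-185.18.36 path quoted earlier in the thread, and that dkms.conf uses a plain PACKAGE_VERSION="..." assignment. The helper names are hypothetical.

```python
import os
import re


def read_package_version(conf_text):
    """Extract PACKAGE_VERSION from the text of a dkms.conf, or None."""
    match = re.search(r"""^PACKAGE_VERSION=["']?([^"'\n]+)""",
                      conf_text, re.MULTILINE)
    return match.group(1) if match else None


def version_from_source_dir(source_dir, module):
    """Return the version suffix of a <module>-<version> directory name."""
    base = os.path.basename(os.path.normpath(source_dir))
    prefix = module + "-"
    return base[len(prefix):] if base.startswith(prefix) else None


def versions_disagree(conf_text, source_dir, module):
    """True when dkms.conf and the source directory name differ."""
    declared = read_package_version(conf_text)
    expected = version_from_source_dir(source_dir, module)
    return declared != expected
```

With the values from this thread, a dkms.conf declaring 185.18.31 inside /usr/src/nvidia-185.18.36 would be flagged.
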



--
Mario Limonciello
superm1@gmail.com
Sent from Austin, Texas, United States
 
