Switch to jiffies for native_sched_clock() when TSC warps
On Wed, 2010-03-10 at 17:56 -0500, Chase Douglas wrote:
> On Wed, Mar 10, 2010 at 5:37 PM, Stefan Bader
> <email@example.com> wrote:
> > Chase Douglas wrote:
> >> I took a look at the x86 code handling the clock to see what could be done
> >> about the TSC warping coming out of resume on some of the newer processors. The
> >> code includes a built-in fallback path that uses the jiffies count instead of
> >> the TSC register if "notsc" is used on the command line. This patch merely sets
> >> this option at runtime if two TSC time stamps differ by more than 6 years.
> >> I'm sending this here first because I've not touched clocking code before. I'm
> >> not sure whether this is a feasible approach, and I would like feedback. Note
> >> that the TSC warping hasn't caused any noticeable issues beyond triggering some
> >> oops messages, so even if there's some skew in the switch from TSC to jiffies
> >> it should hopefully not cause too much of an issue.
> >> The only truly negative outcome I foresee is that the clock won't be stable on
> >> a single CPU.
..and it's hard to determine which CPUs are buggy because they may/may
not have BIOS loaded or kernel loaded microcode fixes.
> >> Programs needing accurate clock timing can pin themselves to a
> >> single CPU in order to get TSC time stamps that are monotonic and accurate (The
> >> TSC register is per cpu, and there may be skew between CPUs).
..believe me, if it can skew, it will skew.
> >> However, if the
> >> TSC has warped we are beyond that point anyways. If you have a warping
> >> processor you should run with notsc if you care about accuracy, even though
> >> precision would be reduced.
..and "notsc" impacts on low-latency (see later).
> > My feeling is that changing sched_clock to jiffies after resume doesn't
> > sound like a good idea. What was wrong with Colin's approach of just fixing the math?
> Colin's patch fixes soft lockup bugs from being fired. That's fixing
> merely one symptom, but not the real problem.
Well, actually, it's a little more complex than that. Here are some
extra things to throw into the discussion:
1) On some processors, the TSC can set the top 32 bits to 0xffffffff
when coming out of S3. This is a processor issue which may be fixable
via a microcode update (loaded from a new BIOS upgrade) or perhaps by
installing the intel-microcode package. So on some systems we can
advise users to first try the intel-microcode update. If the CPU is
misbehaving, perhaps that's the first thing to fix.
2) While poking around I saw that we get spurious warnings from the
softlockup detection code when the approximated seconds value tends
towards 0xffffffff because of a math overflow. This happens whether
or not we use the TSC, so it's good to have it fixed anyhow, even if
the bug only bites after thousands of years of uptime.
3) Disabling the use of the TSC impacts low latency. For example, the
default udelay implementation uses a TSC-based delay which
periodically yields to the scheduler rather than burning cycles in a
hard loop; the TSC lets the delay loop figure out how much delay is
left after coming back from the scheduler. In the non-TSC mode we
burn cycles and never yield, so low-latency users may/will object to
this.
> There are other paths
> that are causing oops messages. Further bugs may be caused by TSC
> warping that we just haven't seen yet.
Here is an example of this: by default a slow I/O operation is delayed
by writing to port 0x80. However, with the io_delay=udelay kernel
parameter the delay is a 2 microsecond udelay(), so if the TSC warps
forward we may pop out of the delay prematurely, which could be a
problem for hardware that depends on the full delay.
If we are *really* unlucky, it is hypothetically possible for the TSC
to warp to 0xffffffffffffffff coming out of S3 and then immediately
wrap to zero. I believe a TSC-based udelay() could then get stuck in
its delay loop for years, centuries, or even millennia.
> Also, the TSC warping issue seems more prevalent than first thought.
> Originally, Colin believed the issue was confined to new Arrandale
> processors, but we're seeing the issue hit Core 2 processors as well
>  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/535077
kernel-team mailing list