Old 01-22-2009, 04:49 PM
Stefan Bader
 
Default SRU Include backport of vmware stable TSC patchset

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/319945

SRU justification:

Impact: Timekeeping relies on the TSC, which is always stable in the virtualized
VMware environment, but the code that detects the TSC as a stable clocksource
fails there. Combined with the acpi_pm timer being able to wrap around while the
guest is not scheduled, this can cause clock skews of several seconds.


Fix: This patch series was backported to Intrepid and will be included in
2.6.29-rc3. The changes do not look like they could cause regressions. However,
they require an ABI bump because they add elements to per_cpu__cpu_info and
boot_cpu_data.


Stefan

Tim, I already took the liberty of adding your ack based on your previous
comments. The ABI bump would coincide with the 2.6.27.12 update, which also
requires one.


--

When all other means of communication fail, try words!


From: Alok Kataria <akataria@vmware.com>
Subject: UBUNTU: x86: add a synthetic TSC_RELIABLE feature bit

Impact: None, bit reservation only

Add a synthetic TSC_RELIABLE feature bit which will be used to mark the
TSC as reliable so that we can skip all the runtime checks for
TSC stability, which have false positives in virtual environments.
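
As a rough illustration of what a feature number such as (3*32+23) means: the value encodes a 32-bit capability word (3) and a bit within that word (23). The sketch below is a user-space model only. NCAPWORDS, cap_set and cap_test are hypothetical names standing in for the kernel's own capability helpers and are not part of the patch.

```c
#include <stdint.h>

#define NCAPWORDS 8                         /* hypothetical array size for this sketch */
#define X86_FEATURE_TSC_RELIABLE (3*32+23)  /* value taken from the patch */

/* Set feature `bit` in a capability array made of 32-bit words. */
void cap_set(uint32_t caps[NCAPWORDS], unsigned int bit)
{
    caps[bit / 32] |= (uint32_t)1 << (bit % 32);
}

/* Return 1 if feature `bit` is set, 0 otherwise. */
int cap_test(const uint32_t caps[NCAPWORDS], unsigned int bit)
{
    return (int)((caps[bit / 32] >> (bit % 32)) & 1u);
}
```

Because the bit lives in word 3, which holds Linux-defined rather than CPUID-reported flags, nothing sets it unless kernel code does so explicitly.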

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Dan Hecht <dhecht@vmware.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
---

include/asm-x86/cpufeature.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)


diff --git a/include/asm-x86/cpufeature.h b/include/asm-x86/cpufeature.h
index cfcfb0a..0184826 100644
--- a/include/asm-x86/cpufeature.h
+++ b/include/asm-x86/cpufeature.h
@@ -82,6 +82,7 @@
#define X86_FEATURE_11AP (3*32+19) /* Bad local APIC aka 11AP */
#define X86_FEATURE_NOPL (3*32+20) /* The NOPL (0F 1F) instructions */
#define X86_FEATURE_AMDC1E (3*32+21) /* AMD C1E detected */
+#define X86_FEATURE_TSC_RELIABLE (3*32+23) /* TSC is known to be reliable */

/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
#define X86_FEATURE_XMM3 (4*32+ 0) /* Streaming SIMD Extensions-3 */
From: Alok Kataria <akataria@vmware.com>
Subject: UBUNTU: x86: add X86_FEATURE_HYPERVISOR feature bit

Impact: Number declaration only.

Add X86_FEATURE_HYPERVISOR bit (CPUID level 1, ECX, bit 31).

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
---

include/asm-x86/cpufeature.h | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)


diff --git a/include/asm-x86/cpufeature.h b/include/asm-x86/cpufeature.h
index 0184826..49e1b59 100644
--- a/include/asm-x86/cpufeature.h
+++ b/include/asm-x86/cpufeature.h
@@ -95,6 +95,7 @@
#define X86_FEATURE_XTPR (4*32+14) /* Send Task Priority Messages */
#define X86_FEATURE_DCA (4*32+18) /* Direct Cache Access */
#define X86_FEATURE_XMM4_2 (4*32+20) /* Streaming SIMD Extensions-4.2 */
+#define X86_FEATURE_HYPERVISOR (4*32+31) /* Running on a hypervisor */

/* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001, word 5 */
#define X86_FEATURE_XSTORE (5*32+ 2) /* on-CPU RNG present (xstore insn) */
@@ -194,6 +195,7 @@ extern const char * const x86_power_flags[32];
#define cpu_has_arch_perfmon boot_cpu_has(X86_FEATURE_ARCH_PERFMON)
#define cpu_has_pat boot_cpu_has(X86_FEATURE_PAT)
#define cpu_has_xmm4_2 boot_cpu_has(X86_FEATURE_XMM4_2)
+#define cpu_has_hypervisor boot_cpu_has(X86_FEATURE_HYPERVISOR)

#if defined(CONFIG_X86_INVLPG) || defined(CONFIG_X86_64)
# define cpu_has_invlpg 1
From: Alok Kataria <akataria@vmware.com>
Subject: UBUNTU: x86: Hypervisor detection and get tsc_freq from hypervisor

BumpABI: yes

This changes the ABI hash for per_cpu__cpu_info and boot_cpu_data.

Impact: Changes timebase calibration on VMware.

v3->v2 : Abstract the hypervisor detection and feature (tsc_freq) request
behind a hypervisor.c file
v2->v1 : Add a x86_hyper_vendor field to the cpuinfo_x86 structure.
This avoids multiple calls to the hypervisor detection function.

This patch adds a function to detect whether we are running under VMware.
The current way to check whether we are on VMware is as follows:
# check if the "hypervisor present bit" is set; if so, read the 0x40000000
cpuid leaf and check for the "VMwareVMware" signature.
# if the above fails, check the DMI vendor names for the "VMware" string;
if we find one, we query the VMware hypervisor port to check whether we are
under VMware.

The DMI + "VMware hypervisor port check" is needed for older VMware products,
which don't implement the hypervisor signature cpuid leaf.
Also note that since we are checking for the DMI signature the hypervisor
port should never be accessed on native hardware.
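
The signature check described above can be sketched in user space without executing cpuid: given the EBX, ECX and EDX values that leaf 0x40000000 would return, the 12-byte vendor string is rebuilt byte by byte. The helper names hyper_signature and is_vmware_signature are hypothetical, and the register values in the usage note are simply the ASCII bytes of "VMwareVMware" packed little-endian, not captured hardware output.

```c
#include <stdint.h>
#include <string.h>

/* Rebuild the 12-character signature from the three registers.
 * Bytes are extracted by shifting, so the result does not depend on
 * the endianness of the machine running this sketch. */
void hyper_signature(uint32_t ebx, uint32_t ecx, uint32_t edx, char out[13])
{
    const uint32_t regs[3] = { ebx, ecx, edx };

    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)
            out[4 * r + b] = (char)((regs[r] >> (8 * b)) & 0xFF);
    out[12] = '\0';
}

/* Return 1 if the registers spell "VMwareVMware", 0 otherwise. */
int is_vmware_signature(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    char sig[13];

    hyper_signature(ebx, ecx, edx, sig);
    return strcmp(sig, "VMwareVMware") == 0;
}
```

For example, is_vmware_signature(0x61774D56, 0x4D566572, 0x65726177) matches, since those words decode to "VMwa", "reVM" and "ware".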

This patch also adds a get_hypervisor_tsc_freq function. Instead of
calibrating the frequency, which can be error-prone in a virtualized
environment, we ask the hypervisor for it. When running on VMware, we get the
frequency by accessing the hypervisor port. Other hypervisors can add code to
the generic routine to get the frequency on their platform.
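
A minimal sketch of the arithmetic involved, assuming (as the patch code does) that the hypervisor hands back the frequency in Hz split across EAX (low 32 bits) and EBX (high 32 bits). The function name tsc_khz_from_regs is hypothetical, and returning 0 on overflow is a choice made for this sketch; the patch itself uses BUG_ON for that case.

```c
#include <stdint.h>

/* Combine the two register halves into a 64-bit Hz value and convert
 * it to kHz. Returns 0 if the kHz value does not fit in 32 bits. */
uint32_t tsc_khz_from_regs(uint32_t eax, uint32_t ebx)
{
    uint64_t hz = (uint64_t)eax | ((uint64_t)ebx << 32);
    uint64_t khz = hz / 1000;

    return (khz >> 32) ? 0 : (uint32_t)khz;
}
```

A 2.4 GHz TSC fits in EAX alone, while a 5 GHz TSC needs EBX = 1 because 5,000,000,000 exceeds 32 bits.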

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Dan Hecht <dhecht@vmware.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
---

arch/x86/kernel/cpu/Makefile | 1
arch/x86/kernel/cpu/common.c | 2 +
arch/x86/kernel/cpu/common_64.c | 2 +
arch/x86/kernel/cpu/hypervisor.c | 48 +++++++++++++++++++++
arch/x86/kernel/cpu/vmware.c | 88 ++++++++++++++++++++++++++++++++++++++
arch/x86/kernel/setup.c | 7 +++
arch/x86/kernel/tsc.c | 9 +++-
include/asm-x86/hypervisor.h | 26 +++++++++++
include/asm-x86/processor.h | 4 ++
include/asm-x86/vmware.h | 26 +++++++++++
10 files changed, 212 insertions(+), 1 deletions(-)
create mode 100644 arch/x86/kernel/cpu/hypervisor.c
create mode 100644 arch/x86/kernel/cpu/vmware.c
create mode 100644 include/asm-x86/hypervisor.h
create mode 100644 include/asm-x86/vmware.h


diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index ee76eaa..0613c56 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -4,6 +4,7 @@

obj-y := intel_cacheinfo.o addon_cpuid_features.o
obj-y += proc.o feature_names.o
+obj-y += vmware.o hypervisor.o

obj-$(CONFIG_X86_32) += common.o bugs.o
obj-$(CONFIG_X86_64) += common_64.o bugs_64.o
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 4e456bd..0a10238 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -14,6 +14,7 @@
#include <asm/mce.h>
#include <asm/pat.h>
#include <asm/asm.h>
+#include <asm/hypervisor.h>
#ifdef CONFIG_X86_LOCAL_APIC
#include <asm/mpspec.h>
#include <asm/apic.h>
@@ -505,6 +506,7 @@ static void __cpuinit identify_cpu(struct cpuinfo_x86 *c)
c->x86, c->x86_model);
}

+ init_hypervisor(c);
/*
* On SMP, boot_cpu_data holds the common feature set between
* all CPUs; so make sure that we indicate which features are
diff --git a/arch/x86/kernel/cpu/common_64.c b/arch/x86/kernel/cpu/common_64.c
index a11f5d4..3450af8 100644
--- a/arch/x86/kernel/cpu/common_64.c
+++ b/arch/x86/kernel/cpu/common_64.c
@@ -34,6 +34,7 @@
#include <asm/sections.h>
#include <asm/setup.h>
#include <asm/genapic.h>
+#include <asm/hypervisor.h>

#include "cpu.h"

@@ -384,6 +385,7 @@ static void __cpuinit identify_cpu(struct cpuinfo_x86 *c)

detect_ht(c);

+ init_hypervisor(c);
/*
* On SMP, boot_cpu_data holds the common feature set between
* all CPUs; so make sure that we indicate which features are
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
new file mode 100644
index 0000000..7bd5506
--- /dev/null
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -0,0 +1,48 @@
+/*
+ * Common hypervisor code
+ *
+ * Copyright (C) 2008, VMware, Inc.
+ * Author : Alok N Kataria <akataria@vmware.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT. See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ */
+
+#include <asm/processor.h>
+#include <asm/vmware.h>
+
+static inline void __cpuinit
+detect_hypervisor_vendor(struct cpuinfo_x86 *c)
+{
+ if (vmware_platform()) {
+ c->x86_hyper_vendor = X86_HYPER_VENDOR_VMWARE;
+ } else {
+ c->x86_hyper_vendor = X86_HYPER_VENDOR_NONE;
+ }
+}
+
+unsigned long get_hypervisor_tsc_freq(void)
+{
+ if (boot_cpu_data.x86_hyper_vendor == X86_HYPER_VENDOR_VMWARE)
+ return vmware_get_tsc_khz();
+ return 0;
+}
+
+void __cpuinit init_hypervisor(struct cpuinfo_x86 *c)
+{
+ detect_hypervisor_vendor(c);
+}
+
diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c
new file mode 100644
index 0000000..d5d1b75
--- /dev/null
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -0,0 +1,88 @@
+/*
+ * VMware Detection code.
+ *
+ * Copyright (C) 2008, VMware, Inc.
+ * Author : Alok N Kataria <akataria@vmware.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT. See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ */
+
+#include <linux/dmi.h>
+#include <asm/div64.h>
+
+#define CPUID_VMWARE_INFO_LEAF 0x40000000
+#define VMWARE_HYPERVISOR_MAGIC 0x564D5868
+#define VMWARE_HYPERVISOR_PORT 0x5658
+
+#define VMWARE_PORT_CMD_GETVERSION 10
+#define VMWARE_PORT_CMD_GETHZ 45
+
+#define VMWARE_PORT(cmd, eax, ebx, ecx, edx) \
+ __asm__("inl (%%dx)" : \
+ "=a"(eax), "=c"(ecx), "=d"(edx), "=b"(ebx) : \
+ "0"(VMWARE_HYPERVISOR_MAGIC), \
+ "1"(VMWARE_PORT_CMD_##cmd), \
+ "2"(VMWARE_HYPERVISOR_PORT), "3"(0) : \
+ "memory");
+
+static inline int __vmware_platform(void)
+{
+ uint32_t eax, ebx, ecx, edx;
+ VMWARE_PORT(GETVERSION, eax, ebx, ecx, edx);
+ return eax != (uint32_t)-1 && ebx == VMWARE_HYPERVISOR_MAGIC;
+}
+
+static unsigned long __vmware_get_tsc_khz(void)
+{
+ uint64_t tsc_hz;
+ uint32_t eax, ebx, ecx, edx;
+
+ VMWARE_PORT(GETHZ, eax, ebx, ecx, edx);
+
+ if (eax == (uint32_t)-1)
+ return 0;
+ tsc_hz = eax | (((uint64_t)ebx) << 32);
+ do_div(tsc_hz, 1000);
+ BUG_ON(tsc_hz >> 32);
+ return tsc_hz;
+}
+
+int vmware_platform(void)
+{
+ if (cpu_has_hypervisor) {
+ unsigned int eax, ebx, ecx, edx;
+ char hyper_vendor_id[13];
+
+ cpuid(CPUID_VMWARE_INFO_LEAF, &eax, &ebx, &ecx, &edx);
+ memcpy(hyper_vendor_id + 0, &ebx, 4);
+ memcpy(hyper_vendor_id + 4, &ecx, 4);
+ memcpy(hyper_vendor_id + 8, &edx, 4);
+ hyper_vendor_id[12] = '\0';
+ if (!strcmp(hyper_vendor_id, "VMwareVMware"))
+ return 1;
+ } else if (dmi_available && dmi_name_in_vendors("VMware") &&
+ __vmware_platform())
+ return 1;
+
+ return 0;
+}
+
+unsigned long vmware_get_tsc_khz(void)
+{
+ BUG_ON(!vmware_platform());
+ return __vmware_get_tsc_khz();
+}
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 6d5a3c4..17021f2 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -98,6 +98,7 @@

#include <mach_apic.h>
#include <asm/paravirt.h>
+#include <asm/hypervisor.h>

#include <asm/percpu.h>
#include <asm/topology.h>
@@ -902,6 +903,12 @@ void __init setup_arch(char **cmdline_p)
e820_reserve_resources();
e820_mark_nosave_regions(max_low_pfn);

+ /*
+ * VMware detection requires dmi to be available, so this
+ * needs to be done after dmi_scan_machine, for the BP.
+ */
+ init_hypervisor(&boot_cpu_data);
+
#ifdef CONFIG_X86_32
request_resource(&iomem_resource, &video_ram_resource);
#endif
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index de850e9..e063537 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -15,6 +15,7 @@
#include <asm/vgtod.h>
#include <asm/time.h>
#include <asm/delay.h>
+#include <asm/hypervisor.h>

unsigned int cpu_khz; /* TSC clocks / usec, not used here */
EXPORT_SYMBOL(cpu_khz);
@@ -189,9 +190,15 @@ unsigned long native_calibrate_tsc(void)
{
u64 tsc1, tsc2, delta, pm1, pm2, hpet1, hpet2;
unsigned long tsc_pit_min = ULONG_MAX, tsc_ref_min = ULONG_MAX;
- unsigned long flags;
+ unsigned long flags, tsc_khz;
int hpet = is_hpet_enabled(), i;

+ tsc_khz = get_hypervisor_tsc_freq();
+ if (tsc_khz) {
+ printk(KERN_INFO "TSC: Frequency read from the hypervisor\n");
+ return tsc_khz;
+ }
+
/*
* Run 5 calibration loops to get the lowest frequency value
* (the best estimate). We use two different calibration modes
diff --git a/include/asm-x86/hypervisor.h b/include/asm-x86/hypervisor.h
new file mode 100644
index 0000000..369f5c5
--- /dev/null
+++ b/include/asm-x86/hypervisor.h
@@ -0,0 +1,26 @@
+/*
+ * Copyright (C) 2008, VMware, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT. See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ */
+#ifndef ASM_X86__HYPERVISOR_H
+#define ASM_X86__HYPERVISOR_H
+
+extern unsigned long get_hypervisor_tsc_freq(void);
+extern void init_hypervisor(struct cpuinfo_x86 *c);
+
+#endif
diff --git a/include/asm-x86/processor.h b/include/asm-x86/processor.h
index 4df3e2f..a06d9a5 100644
--- a/include/asm-x86/processor.h
+++ b/include/asm-x86/processor.h
@@ -109,6 +109,7 @@ struct cpuinfo_x86 {
/* Index into per_cpu list: */
u16 cpu_index;
#endif
+ unsigned int x86_hyper_vendor;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));

#define X86_VENDOR_INTEL 0
@@ -122,6 +123,9 @@ struct cpuinfo_x86 {

#define X86_VENDOR_UNKNOWN 0xff

+#define X86_HYPER_VENDOR_NONE 0
+#define X86_HYPER_VENDOR_VMWARE 1
+
/*
* capabilities of CPUs
*/
diff --git a/include/asm-x86/vmware.h b/include/asm-x86/vmware.h
new file mode 100644
index 0000000..02dfea5
--- /dev/null
+++ b/include/asm-x86/vmware.h
@@ -0,0 +1,26 @@
+/*
+ * Copyright (C) 2008, VMware, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT. See the GNU General Public License for more
+ * details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ */
+#ifndef ASM_X86__VMWARE_H
+#define ASM_X86__VMWARE_H
+
+extern unsigned long vmware_get_tsc_khz(void);
+extern int vmware_platform(void);
+
+#endif
From: Alok Kataria <akataria@vmware.com>
Subject: UBUNTU: x86: Add a synthetic TSC_RELIABLE feature bit.

Impact: Changes timebase calibration on VMware.

Use the synthetic TSC_RELIABLE bit to work around virtualization anomalies.

Virtual TSCs can be kept nearly in sync, but because the virtual TSC
offset is set by software, it's not perfect. So, the TSC
synchronization test can fail. Even then the TSC can be used as a
clocksource since the VMware platform exports a reliable TSC to the
guest for timekeeping purposes. Use this bit to check if we need to
skip the TSC sync checks.

Along with this, also set the CONSTANT_TSC bit when on VMware, since we
still want to use the TSC as a clocksource in a VM running on hardware that
has unsynchronized TSCs (Opterons); the hypervisor takes
care of providing a consistent TSC to the guest.

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Dan Hecht <dhecht@vmware.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
---

arch/x86/kernel/cpu/hypervisor.c | 11 ++++++++++-
arch/x86/kernel/cpu/vmware.c | 18 ++++++++++++++++++
arch/x86/kernel/tsc_sync.c | 8 +++++++-
include/asm-x86/vmware.h | 1 +
4 files changed, 36 insertions(+), 2 deletions(-)


diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index 7bd5506..35ae2b7 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -41,8 +41,17 @@ unsigned long get_hypervisor_tsc_freq(void)
return 0;
}

+static inline void __cpuinit
+hypervisor_set_feature_bits(struct cpuinfo_x86 *c)
+{
+ if (boot_cpu_data.x86_hyper_vendor == X86_HYPER_VENDOR_VMWARE) {
+ vmware_set_feature_bits(c);
+ return;
+ }
+}
+
void __cpuinit init_hypervisor(struct cpuinfo_x86 *c)
{
detect_hypervisor_vendor(c);
+ hypervisor_set_feature_bits(c);
}
-
diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c
index d5d1b75..2ac4394 100644
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -86,3 +86,21 @@ unsigned long vmware_get_tsc_khz(void)
BUG_ON(!vmware_platform());
return __vmware_get_tsc_khz();
}
+
+/*
+ * VMware hypervisor takes care of exporting a reliable TSC to the guest.
+ * Still, due to timing difference when running on virtual cpus, the TSC can
+ * be marked as unstable in some cases. For example, the TSC sync check at
+ * bootup can fail due to a marginal offset between vcpus' TSCs (though the
+ * TSCs do not drift from each other). Also, the ACPI PM timer clocksource
+ * is not suitable as a watchdog when running on a hypervisor because the
+ * kernel may miss a wrap of the counter if the vcpu is descheduled for a
+ * long time. To skip these checks at runtime we set these capability bits,
+ * so that the kernel could just trust the hypervisor with providing a
+ * reliable virtual TSC that is suitable for timekeeping.
+ */
+void __cpuinit vmware_set_feature_bits(struct cpuinfo_x86 *c)
+{
+ set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
+ set_cpu_cap(c, X86_FEATURE_TSC_RELIABLE);
+}
diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c
index 9ffb01c..5977c40 100644
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -108,6 +108,12 @@ void __cpuinit check_tsc_sync_source(int cpu)
if (unsynchronized_tsc())
return;

+ if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE)) {
+ printk(KERN_INFO
+ "Skipping synchronization checks as TSC is reliable.\n");
+ return;
+ }
+
printk(KERN_INFO "checking TSC synchronization [CPU#%d -> CPU#%d]:",
smp_processor_id(), cpu);

@@ -161,7 +167,7 @@ void __cpuinit check_tsc_sync_target(void)
{
int cpus = 2;

- if (unsynchronized_tsc())
+ if (unsynchronized_tsc() || boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
return;

/*
diff --git a/include/asm-x86/vmware.h b/include/asm-x86/vmware.h
index 02dfea5..c11b7e1 100644
--- a/include/asm-x86/vmware.h
+++ b/include/asm-x86/vmware.h
@@ -22,5 +22,6 @@

extern unsigned long vmware_get_tsc_khz(void);
extern int vmware_platform(void);
+extern void vmware_set_feature_bits(struct cpuinfo_x86 *c);

#endif
From: Alok Kataria <akataria@vmware.com>
Subject: UBUNTU: x86: Skip verification by the watchdog for TSC clocksource.

Impact: Changes timekeeping on VMware (or with tsc=reliable).

This is achieved by resetting the CLOCK_SOURCE_MUST_VERIFY flag.

We add a tsc=reliable commandline option to enable this.
This enables legacy hardware without HPET, LAPIC, or ACPI timers
to enter high-resolution timer mode.

Along with that, this has been extended to virtualized environments
too: we now also skip the verification if the X86_FEATURE_TSC_RELIABLE bit is set.
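
The effect on the clocksource flags can be modelled as below. This is a sketch under stated assumptions: the two flag values are picked for illustration (the kernel defines its own in its clocksource header), and tsc_flags is a hypothetical helper, not a function from the patch.

```c
#include <stdint.h>

/* Flag values chosen for this sketch; the kernel defines its own. */
#define CLOCK_SOURCE_IS_CONTINUOUS 0x01u
#define CLOCK_SOURCE_MUST_VERIFY   0x02u

/* Return the flags the TSC clocksource would be registered with.
 * `cmdline_reliable` models tsc=reliable on the command line,
 * `feature_reliable` models X86_FEATURE_TSC_RELIABLE being set. */
uint32_t tsc_flags(uint32_t flags, int cmdline_reliable, int feature_reliable)
{
    if (cmdline_reliable || feature_reliable)
        flags &= ~CLOCK_SOURCE_MUST_VERIFY; /* watchdog no longer verifies the TSC */
    return flags;
}
```

Either source of trust is enough on its own, which is why the patch funnels the Geode case, the boot option and the feature bit into a single tsc_clocksource_reliable variable.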

This is important since there is a wrap-around problem with the acpi_pm timer.
The acpi_pm counter is only 24 bits wide and can overflow in ~4 seconds. With
NO_HZ kernels in a virtualized environment, the guest can be descheduled for
longer than that, so we may miss a wrap of the acpi_pm counter. When the TSC is
the clocksource and the acpi_pm timer is the watchdog clocksource, this error
in acpi_pm results in the TSC being marked as unstable, and time effectively
drops in chunks of 4 seconds whenever a wrap is missed. Since the virtualized
TSC is reliable on VMware, we should always use the TSC clocksource there, so
we skip the verification at runtime by checking for the feature bit.
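
The wrap interval follows from the counter width and the ACPI PM timer's fixed 3.579545 MHz clock: 2^24 / 3579545 is roughly 4.7 seconds, which the message rounds to ~4. The sketch below is a user-space model with hypothetical helper names; it shows how a masked subtraction of two readings silently drops every whole wrap that happened in between.

```c
#include <stdint.h>

#define ACPI_PM_HZ   3579545u     /* ACPI PM timer frequency in Hz */
#define ACPI_PM_MASK 0x00FFFFFFu  /* the counter is 24 bits wide */

/* Seconds until the 24-bit counter wraps. */
double acpi_pm_wrap_seconds(void)
{
    return (double)(ACPI_PM_MASK + 1u) / (double)ACPI_PM_HZ;
}

/* Elapsed ticks as computed from two raw readings. A single wrap
 * between the readings is handled; additional wraps are invisible. */
uint32_t acpi_pm_delta(uint32_t prev, uint32_t now)
{
    return (now - prev) & ACPI_PM_MASK;
}

/* Ticks that go unnoticed when the true elapsed tick count spans one
 * or more full counter periods. */
uint64_t acpi_pm_lost_ticks(uint64_t true_elapsed)
{
    return true_elapsed - (true_elapsed & ACPI_PM_MASK);
}
```

If a vcpu is descheduled for slightly more than one counter period, the watchdog sees only the remainder, concludes that the TSC ran fast by a full period, and marks it unstable.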

Since we reset the flag for mgeode systems too, I have combined
the mgeode case with the feature bit check.

Signed-off-by: Jeff Hansen <jhansen@cardaccess-inc.com>
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Dan Hecht <dhecht@vmware.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
---

Documentation/kernel-parameters.txt | 7 +++++++
arch/x86/kernel/tsc.c | 33 +++++++++++++++++++++------------
2 files changed, 28 insertions(+), 12 deletions(-)


diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 4e0d37d..a3506d2 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2197,6 +2197,13 @@ and is between 256 and 4096 characters. It is defined in the file
Format:
<io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>, <mpu_io>,<mpu_irq>

+ tsc= Disable clocksource-must-verify flag for TSC.
+ Format: <string>
+ [x86] reliable: mark tsc clocksource as reliable, this
+ disables clocksource verification at runtime.
+ Used to enable high-resolution timer mode on older
+ hardware, and in virtualized environment.
+
turbografx.map[2|3]= [HW,JOY]
TurboGraFX parallel port interface
Format:
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index e063537..93a4494 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -32,6 +32,7 @@ static int tsc_unstable;
erroneous rdtsc usage on !cpu_has_tsc processors */
static int tsc_disabled = -1;

+static int tsc_clocksource_reliable;
/*
* Scheduler clock - returns current time in nanosec units.
*/
@@ -99,6 +100,15 @@ int __init notsc_setup(char *str)

__setup("notsc", notsc_setup);

+static int __init tsc_setup(char *str)
+{
+ if (!strcmp(str, "reliable"))
+ tsc_clocksource_reliable = 1;
+ return 1;
+}
+
+__setup("tsc=", tsc_setup);
+
#define MAX_RETRIES 5
#define SMI_TRESHOLD 50000

@@ -564,24 +574,21 @@ static struct dmi_system_id __initdata bad_tsc_dmi_table[] = {
{}
};

-/*
- * Geode_LX - the OLPC CPU has a possibly a very reliable TSC
- */
+static void __init check_system_tsc_reliable(void)
+{
#ifdef CONFIG_MGEODE_LX
-/* RTSC counts during suspend */
+ /* RTSC counts during suspend */
#define RTSC_SUSP 0x100
-
-static void __init check_geode_tsc_reliable(void)
-{
unsigned long res_low, res_high;

rdmsr_safe(MSR_GEODE_BUSCONT_CONF0, &res_low, &res_high);
+ /* Geode_LX - the OLPC CPU has a possibly a very reliable TSC */
if (res_low & RTSC_SUSP)
- clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
-}
-#else
-static inline void check_geode_tsc_reliable(void) { }
+ tsc_clocksource_reliable = 1;
#endif
+ if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
+ tsc_clocksource_reliable = 1;
+}

/*
* Make an educated guess if the TSC is trustworthy and synchronized
@@ -616,6 +623,8 @@ static void __init init_tsc_clocksource(void)
{
clocksource_tsc.mult = clocksource_khz2mult(tsc_khz,
clocksource_tsc.shift);
+ if (tsc_clocksource_reliable)
+ clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
/* lower the rating if we already know its unstable: */
if (check_tsc_unstable()) {
clocksource_tsc.rating = 0;
@@ -676,7 +685,7 @@ void __init tsc_init(void)
if (unsynchronized_tsc())
mark_tsc_unstable("TSCs unsynchronized");

- check_geode_tsc_reliable();
+ check_system_tsc_reliable();
init_tsc_clocksource();
}

From: Alok Kataria <akataria@vmware.com>
Subject: UBUNTU: x86: VMware: Fix vmware_get_tsc code

Impact: Fix possible failure to calibrate the TSC on VMware near 4 GHz

The current code that reads the TSC frequency from the VMware
hypervisor breaks on a processor with a frequency of
(4G-1) Hz, because on such a processor eax legitimately
holds UINT_MAX.
Instead, we check whether EBX changed to decide if we were able to
read the frequency from the hypervisor.
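
The ambiguity can be shown with a small model. The three helpers below are hypothetical and only mirror the two comparisons in the diff: the old check looks at EAX, the new one preloads EBX with UINT_MAX and tests whether the hypervisor overwrote it.

```c
#include <stdint.h>
#include <limits.h>

/* Split a frequency in Hz into the EAX (low) and EBX (high) halves. */
void split_hz(uint64_t hz, uint32_t *eax, uint32_t *ebx)
{
    *eax = (uint32_t)(hz & 0xFFFFFFFFu);
    *ebx = (uint32_t)(hz >> 32);
}

/* Old check: failure was inferred from EAX being all ones. */
int gethz_failed_old(uint32_t eax)
{
    return eax == (uint32_t)-1;
}

/* New check: EBX starts out as UINT_MAX and must have been changed. */
int gethz_failed_new(uint32_t ebx)
{
    return ebx == UINT_MAX;
}
```

For a frequency of 4294967295 Hz the old check reports a failure even though the read succeeded, while the new check accepts it because EBX came back as 0.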

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
---

arch/x86/kernel/cpu/vmware.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)


diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c
index 2ac4394..a0905ec 100644
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -36,7 +36,7 @@
"=a"(eax), "=c"(ecx), "=d"(edx), "=b"(ebx) : \
"0"(VMWARE_HYPERVISOR_MAGIC), \
"1"(VMWARE_PORT_CMD_##cmd), \
- "2"(VMWARE_HYPERVISOR_PORT), "3"(0) : \
+ "2"(VMWARE_HYPERVISOR_PORT), "3"(UINT_MAX) : \
"memory");

static inline int __vmware_platform(void)
@@ -53,7 +53,7 @@ static unsigned long __vmware_get_tsc_khz(void)

VMWARE_PORT(GETHZ, eax, ebx, ecx, edx);

- if (eax == (uint32_t)-1)
+ if (ebx == UINT_MAX)
return 0;
tsc_hz = eax | (((uint64_t)ebx) << 32);
do_div(tsc_hz, 1000);
From: Alok Kataria <akataria@vmware.com>
Subject: UBUNTU: x86: vmware: look for DMI string in the product serial key

Impact: Should permit VMware detection on older platforms where the
vendor is changed. Could theoretically cause a regression if some
weird serial number scheme contains the string "VMware" by pure
chance. Seems unlikely, especially with the mixed case.

In some user-configured cases, VMware may choose not to include a VMware-specific
DMI string, but the product serial key is always there and is VMware-specific.
Add an interface to check the serial key when checking for VMware in the DMI
information.
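
The new check boils down to a NULL-safe, case-sensitive substring search on one DMI field. A stand-alone sketch follows, with the hypothetical name name_in_field in place of the kernel's dmi_name_in_serial and with the field passed in as a parameter rather than read from the DMI table.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if `needle` occurs in `field`, 0 otherwise.
 * A missing field (NULL) never matches. */
int name_in_field(const char *field, const char *needle)
{
    return field != NULL && strstr(field, needle) != NULL;
}
```

Since strstr is case-sensitive, a made-up serial such as "VMware-56 4d 00 00" matches "VMware" while "vmware-56" does not, which is the mixed-case property the Impact note relies on.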

Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
---

arch/x86/kernel/cpu/vmware.c | 7 ++++++-
drivers/firmware/dmi_scan.c | 11 +++++++++++
include/linux/dmi.h | 2 ++
3 files changed, 19 insertions(+), 1 deletions(-)


diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c
index a0905ec..c034bda 100644
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -61,6 +61,11 @@ static unsigned long __vmware_get_tsc_khz(void)
return tsc_hz;
}

+/*
+ * While checking the dmi string information, just checking the product
+ * serial key should be enough, as this will always have a VMware
+ * specific string when running under VMware hypervisor.
+ */
int vmware_platform(void)
{
if (cpu_has_hypervisor) {
@@ -74,7 +79,7 @@ int vmware_platform(void)
hyper_vendor_id[12] = '\0';
if (!strcmp(hyper_vendor_id, "VMwareVMware"))
return 1;
- } else if (dmi_available && dmi_name_in_vendors("VMware") &&
+ } else if (dmi_available && dmi_name_in_serial("VMware") &&
__vmware_platform())
return 1;

diff --git a/drivers/firmware/dmi_scan.c b/drivers/firmware/dmi_scan.c
index 455575b..4dd780c 100644
--- a/drivers/firmware/dmi_scan.c
+++ b/drivers/firmware/dmi_scan.c
@@ -457,6 +457,17 @@ const char *dmi_get_system_info(int field)
}
EXPORT_SYMBOL(dmi_get_system_info);

+/**
+ * dmi_name_in_serial - Check if string is in the DMI product serial
+ * information.
+ */
+int dmi_name_in_serial(const char *str)
+{
+ int f = DMI_PRODUCT_SERIAL;
+ if (dmi_ident[f] && strstr(dmi_ident[f], str))
+ return 1;
+ return 0;
+}

/**
* dmi_name_in_vendors - Check if string is anywhere in the DMI vendor information.
diff --git a/include/linux/dmi.h b/include/linux/dmi.h
index 2a063b6..098e292 100644
--- a/include/linux/dmi.h
+++ b/include/linux/dmi.h
@@ -81,6 +81,7 @@ extern const struct dmi_device * dmi_find_device(int type, const char *name,
extern void dmi_scan_machine(void);
extern int dmi_get_year(int field);
extern int dmi_name_in_vendors(const char *str);
+extern int dmi_name_in_serial(const char *str);
extern int dmi_available;
extern int dmi_walk(void (*decode)(const struct dmi_header *));

@@ -93,6 +94,7 @@ static inline const struct dmi_device * dmi_find_device(int type, const char *na
static inline void dmi_scan_machine(void) { return; }
static inline int dmi_get_year(int year) { return 0; }
static inline int dmi_name_in_vendors(const char *s) { return 0; }
+static inline int dmi_name_in_serial(const char *s) { return 0; }
#define dmi_available 0
static inline int dmi_walk(void (*decode)(const struct dmi_header *))
{ return -1; }
--
kernel-team mailing list
kernel-team@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/kernel-team
 
Old 01-26-2009, 07:01 PM
Stefan Bader
 
Default SRU Include backport of vmware stable TSC patchset

Stefan Bader wrote:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/319945
>
> SRU justification:
>
> Impact: Timekeeping relies on TSC which is always stable in the
> virtualized vmware environment. But the code to detect TSC as a stable
> clocksource fails. Together with the possibility of the acpi_pm timer to
> wrap around while not scheduled, this can cause clock skews of several seconds.
>
> Fix: This patch series was backported to Intrepid and will get included
> with 2.6.29rc3. The changes do not look like they could cause
> regressions. However they require an ABI bump for adding elements to
> per_cpu__cpu_info and boot_cpu_data.
>
> Stefan
>
> Tim, I already took the liberty of adding your ack based on your
> previous comments. The ABI bump would fall together with the 2.6.27.12
> update which also requires one.
>

Applied and pushed

--

When all other means of communication fail, try words!



 
Old 02-03-2009, 06:55 AM
Stefan Bader
 
Default SRU Include backport of vmware stable TSC patchset

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/319945

This is the port to Jaunty of those patches.

SRU justification:

Impact: Timekeeping relies on TSC which is always stable in the virtualized
vmware environment. But the code to detect TSC as a stable clocksource fails.
Together with the possibility of the acpi_pm timer to wrap around while not
scheduled, this can cause clock skews of several seconds.

Fix: This patch series was backported to Intrepid and will get included with
2.6.29rc3. The changes do not look like they could cause regressions. However
they require an ABI bump for adding elements to per_cpu__cpu_info and
boot_cpu_data.

Stefan
 
Old 02-03-2009, 12:26 PM
Tim Gardner
 
Default SRU Include backport of vmware stable TSC patchset

Stefan Bader wrote:
>
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/319945
>
> This is the port to Jaunty of those patches.
>
> SRU justification:
>
> Impact: Timekeeping relies on TSC which is always stable in the virtualized
> vmware environment. But the code to detect TSC as a stable clocksource fails.
> Together with the possibility of the acpi_pm timer to wrap around while not
> scheduled, this can cause clock skews of several seconds.
>
> Fix: This patch series was backported to Intrepid and will get included with
> 2.6.29rc3. The changes do not look like they could cause regressions. However
> they require an ABI bump for adding elements to per_cpu__cpu_info and
> boot_cpu_data.
>
> Stefan
>
>
>

applied

--
Tim Gardner tim.gardner@canonical.com
