Performance loss during normal operation in a Dell Latitude E6500
laptop due to processor and bus clock throttling
Randall Cotton - recotton@earthlink.net
Revision 1.1 - July 31, 2009
© 2009 Randall Cotton
Introduction:
This article describes how processor performance can decline by more than 95% in a Dell Latitude E6500
running Windows XP Professional (Service Pack 3) by running routine software at normal operating
temperatures due to activation of the following clock throttling mechanisms present in Intel Core 2 Duo
processors:
I. Performance State Transitions
II. Software-controlled Clock Modulation (also called On-Demand Clock Modulation)
These clock throttling features are designed to be invoked to conserve energy when a processor is idle or to
head off an overheating condition. However, in the tested Dell Latitude E6500, they are engaged at normal
operating temperatures even when demand for processing power is high. This can result in severe loss of
performance under routine use at normal ambient room temperatures. This loss of performance can remain until
the system is rebooted (and, since it is temperature-dependent, the performance loss can conceivably remain
across reboots).
To illustrate and reliably reproduce this phenomenon, various methods are presented that raise internal system
operating temperatures enough to cause the throttling mechanisms to be invoked while monitoring key
parameters such as processor temperatures and operating frequencies. In some cases, freely available third-party
processor-stressing software is used, but in other cases, even simple Windows-native processes like Calculator
and a 4-line looped batch file that outputs scrolling lines of text are enough to trigger substantial loss of
processor performance. In the worst case, all of the following takes place:
I. Reduction of internal CPU clock frequency from 2261 MHz to 798 MHz
II. Reduction of FSB (Front Side Bus) clock frequency from 266 MHz to 133 MHz
III. In addition to the above, overall reduction in the CPU effective clock rate by a factor of
eight due to Software-controlled Clock Modulation.
Note: CPU stands for Central Processing Unit, a PC’s main processor chip.
Note: Front Side Bus is the technical name for the main data path between the CPU and the rest of the system.
Even setting aside the negative performance effect of FSB downshifting in II above, the effective processing
power is reduced to 1/8 of 798 MHz, or about 100 MHz. This is a reduction to less than 5% of full capacity (100/2261 =
4.42%).
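The arithmetic above can be checked with a short Python sketch. It uses only the figures quoted in this article; the variable names are illustrative. Without rounding, 1/8 of 798 MHz is 99.75 MHz, which the text rounds to 100 MHz.

```python
# Worst-case effective clock rate, using the figures quoted in the article.
FULL_SPEED_MHZ = 2261   # normal clock: 8.5 x 266 MHz FSB
P3_CLOCK_MHZ = 798      # lowest P-state: 6 x 133 MHz FSB
DUTY_CYCLE = 1 / 8      # lowest Software-controlled Clock Modulation setting

# The clock only runs during 1/8 of each time slice.
effective_mhz = P3_CLOCK_MHZ * DUTY_CYCLE

# Fraction of full processing capacity that remains.
remaining_fraction = effective_mhz / FULL_SPEED_MHZ

print(f"Effective clock:    {effective_mhz:.2f} MHz")
print(f"Remaining capacity: {remaining_fraction:.2%}")
print(f"Performance loss:   {1 - remaining_fraction:.2%}")
```

This estimate ignores the additional cost of the halved FSB frequency, so the real-world loss may be somewhat larger.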
It is noteworthy that all performance loss takes place with neither notice nor explanation to the user.
Background: Processor Clock-throttling Mechanisms
The Dell Latitude E6500 features an Intel Core 2 Duo processor, which offers several mechanisms that can be
used (typically by operating systems such as Microsoft Windows) for both thermal management and power
conservation. Background information follows on the two mechanisms already mentioned.
A. Performance State Transitions
Many of Intel’s CPUs are designed to run at a variety of performance states. A performance state, or P-state,
is generally a combination of processor frequency (such as 2.26GHz) and a voltage level (such as 1.2V) 1 .
Lower frequencies provide less processing performance, but require lower voltages supplied to the
processor and therefore consume less electrical power. Typically, Intel CPUs in PCs use 4 performance
states. There can be a wide range of processor performance among these states. In the Dell Latitude E6500
tested for this article under Windows XP, the following states are used by default:
Performance State   FSB Clock Frequency   CPU Clock Frequency                                                  Est. CPU Voltage
P0                  266 MHz               2261 MHz (8.5 x FSB frequency) with Intel Dynamic Acceleration       1.2V
P1                  266 MHz               2261 MHz (8.5 x FSB frequency) without Intel Dynamic Acceleration    1.1375V
P2                  266 MHz               1596 MHz (6 x FSB frequency)                                         1.0V
P3                  133 MHz               798 MHz (6 x FSB frequency)                                          0.925V
Note: The terminology P0, P1, etc., referring to P-state 0, P-state 1, etc., is from the ACPI standard – Advanced Configuration and Power
Interface. That standard and its relevance are described later in this article.
Note: Intel Dynamic Acceleration refers to a feature whereby one of the two processing cores is clocked a little higher than normal when the
other core is not under much load. In this case, the boost to one of the cores is to 2394 MHz (9 x 266 MHz FSB frequency) 2 .
Note: CPU clock frequencies are derived from the Front Side Bus frequency. Traditionally, the CPU frequency has been an integer multiple of
the FSB frequency, though some Core 2 Duo processors support half-step multipliers such as 8.5 in this case.
Note: Performance State P3 is achieved not by slowing just the CPU clock, but rather by slowing the Front Side Bus (which necessarily slows the
CPU clock by the same factor). This provides for even greater power savings (and even lower performance) than by just slowing the CPU clock
alone. Intel calls this technique Dynamic FSB Frequency Switching 3 .
Note: These aren’t the only P-states that are technically possible with this processor. They just constitute the states being used by the Dell
Latitude E6500 under test.
Note: P1 is sometimes referred to as HFM (Highest Frequency Mode) by Intel documentation. Likewise, P2 is sometimes referred to as LFM
(Lowest Frequency Mode) and P3 is referred to as SLFM (Super Low Frequency Mode) with the “Super” referring to Dynamic FSB Frequency
Switching.
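As a cross-check of the P-state table and the notes above, the following Python sketch derives each CPU clock frequency from its FSB frequency and multiplier. The values come from this article; the `PState` class and its field names are illustrative and do not correspond to any Intel or ACPI data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PState:
    """One performance state as listed in the article's table."""
    name: str
    fsb_mhz: int        # Front Side Bus clock frequency
    multiplier: float   # CPU clock = multiplier x FSB clock
    est_voltage: float  # estimated CPU voltage

    @property
    def cpu_mhz(self) -> float:
        return self.fsb_mhz * self.multiplier


P_STATES = [
    PState("P0", 266, 8.5, 1.2),     # with Intel Dynamic Acceleration
    PState("P1", 266, 8.5, 1.1375),  # without Intel Dynamic Acceleration
    PState("P2", 266, 6.0, 1.0),
    PState("P3", 133, 6.0, 0.925),   # FSB itself is halved in this state
]

# Intel Dynamic Acceleration boosts one core to a 9x multiplier.
IDA_BOOST_MHZ = 266 * 9

for state in P_STATES:
    print(f"{state.name}: {state.fsb_mhz} MHz x {state.multiplier} "
          f"= {state.cpu_mhz:.0f} MHz at about {state.est_voltage} V")
print(f"Dynamic Acceleration boost: {IDA_BOOST_MHZ} MHz")
```

Note that P2 and P3 share the same 6x multiplier; the drop from 1596 MHz to 798 MHz comes entirely from halving the FSB frequency.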
Sometimes, transitions to lower-power P-states are made in order to decrease power consumption when
processing demand is low. As soon as processing demand increases enough to warrant a higher-performance
P-state, the necessary upclocking transition is made nearly instantaneously. Intel calls this power-saving
technique SpeedStep Technology 4 . But lower-performance P-states can also be used on a sustained basis for
the purpose of lowering operating temperatures even under high processing demand.
B. Software-controlled Clock Modulation
This is a mechanism to reduce the effective clock rate of the CPU by slicing up time into tiny equal intervals
and then only running the CPU clock (and hence, the processor) during a fraction of each time slice. The
fraction can be one of eight settings, from 1/8 through 8/8 = 1. That is, the clock runs either 8/8 of the time (all the
time), 7/8 of the time, etc. down to 1/8 of the time, depending on what setting is chosen by software. So in
the extreme case, if the setting is 1/8, the processor is effectively only running at 1/8 or 12.5% of the speed
it was before Software-controlled Clock Modulation was engaged 5 .
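The effect of each setting can be sketched in Python as follows. The function name `effective_clock_mhz` is hypothetical and only models the arithmetic described above; it does not read or write any processor register.

```python
from fractions import Fraction


def effective_clock_mhz(base_mhz: float, eighths: int) -> float:
    """Effective clock rate when the clock runs `eighths`/8 of the time."""
    if not 1 <= eighths <= 8:
        raise ValueError("setting must be between 1 and 8 (eighths)")
    return float(base_mhz * Fraction(eighths, 8))


# All eight settings applied to the 798 MHz clock of the lowest P-state.
for eighths in range(8, 0, -1):
    mhz = effective_clock_mhz(798, eighths)
    print(f"{eighths}/8 duty cycle: {mhz:.2f} MHz ({eighths / 8:.1%})")
```

Because the modulation is a multiplier on whatever clock is already in effect, it compounds with any P-state reduction rather than replacing it.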
Normal operating temperatures
In general, CPUs and GPUs (Graphics Processing Units) generate the highest temperatures in a PC. However,
these chips are both designed to run without problem even at fairly high temperatures. Normal operating
temperatures range from around 35°C when idle to the 80s or even 90s Celsius under heavy load. All processors
of any type are designed to operate normally under a specified range of temperatures. For example, the Intel
Core 2 Duo P8400 processor in the system under test is rated for normal operation up to 105° Celsius 6 . Though
specifications for the NVIDIA Quadro NVS 160M processor are not available online, GPU chip technology is
similar to CPU technology and GPUs are designed to operate in similar temperature ranges (sometimes at even
higher temperatures than a system’s CPU).
While high temperatures beyond the rated operating range can result in processing errors and extreme
temperatures can result in physical damage, modern-day CPUs and GPUs are commonly designed with built-in
temperature sensors, automatic clock throttling capabilities and emergency shutdown mechanisms. For example,
the Intel Core 2 Duo P8400 processor under test is capable of automatically engaging both clock-throttling
mechanisms described above in the event of overheating (once the internally measured temperature exceeds the
rated maximum of 105° Celsius). The processor also has an emergency shutdown feature, which halts all
operation (and requires power to be cut) once the internally measured temperature reaches around 125° Celsius 7 .
How sustained clock throttling occurs at normal operating temperatures
In the system under test, once internal temperatures reach a given point (still well within normal operating
range), the system is throttled first by transitioning to lower-performance P-states. As long as the temperature
remains elevated enough, lower and lower performance P-states are activated in succession at 30-second
intervals. Assuming the system starts in state P0, it first transitions to P1, then 30 seconds later to P2 and 30
seconds after that to P3. It’s possible for the system to throttle down only to P1 or P2 and then hold there,
forgoing any further throttling so long as the internal temperature declines enough. However, if the
lowest-performing P-state is reached and the system still determines the temperature is too high, then it will begin
Software-controlled Clock Modulation, successively cutting the effective CPU clock (and thus, the CPU’s
performance) in the available 1/8 increments, as previously described. This also happens at intervals of 30
seconds and continues as long as the system determines the temperature is too high. Once the system progresses
through all available Software-controlled Clock Modulation settings, the system effects no further throttling. In
total, there are 10 incremental steps, progressively applied at 30-second intervals, that are used to throttle the
processing power of the system:
Throttling action                                   Effective processing power (frequency)
1. Transition from state P0 to P1                   2261 MHz
2. Transition from state P1 to P2                   1596 MHz
3. Transition from state P2 to P3                   798 MHz (with FSB frequency cut in half)
4. Clock Throttling at 7/8 capacity of state P3     700 MHz (with FSB frequency still reduced by half)
5. Clock Throttling at 6/8 capacity of state P3     600 MHz (with FSB frequency still reduced by half)
6. Clock Throttling at 5/8 capacity of state P3     500 MHz (with FSB frequency still reduced by half)
7. Clock Throttling at 4/8 capacity of state P3     400 MHz (with FSB frequency still reduced by half)
8. Clock Throttling at 3/8 capacity of state P3     300 MHz (with FSB frequency still reduced by half)
9. Clock Throttling at 2/8 capacity of state P3     200 MHz (with FSB frequency still reduced by half)
10. Clock Throttling at 1/8 capacity of state P3    100 MHz (with FSB frequency still reduced by half)
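The ten steps can be reproduced with a short Python sketch. It assumes the temperature stays above the threshold throughout, so that every step is taken, and it counts elapsed time from the first throttling action. The function name `throttle_schedule` is illustrative. The unrounded frequencies it prints (for example 698.25 MHz at 7/8) correspond to the rounded values in the table (700 MHz).

```python
STEP_INTERVAL_S = 30   # one throttling step every 30 seconds
P3_CLOCK_MHZ = 798     # CPU clock in the lowest P-state


def throttle_schedule():
    """Return (action, effective MHz) for each of the 10 throttling steps."""
    steps = [
        ("P0 -> P1", 2261.0),
        ("P1 -> P2", 1596.0),
        ("P2 -> P3", float(P3_CLOCK_MHZ)),
    ]
    # After P3 is reached, the duty cycle drops from 7/8 down to 1/8.
    for eighths in range(7, 0, -1):
        steps.append((f"P3 at {eighths}/8", P3_CLOCK_MHZ * eighths / 8))
    return steps


schedule = throttle_schedule()
for index, (action, mhz) in enumerate(schedule):
    elapsed_s = index * STEP_INTERVAL_S  # assumes the first step occurs at t = 0
    print(f"step {index + 1:2d}  t = {elapsed_s:3d} s  {action:<12} {mhz:8.2f} MHz")
```

Under these assumptions, the system goes from full speed to its lowest effective clock rate in nine intervals, or four and a half minutes after the first throttling action.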
While this aggressive degradation of the system’s capabilities might be warranted if internal temperatures were
so high that they might cause damage or malfunction, unfortunately in the system under test, these steps are
activated at normal internal operating temperatures, in some cases even well below 70° Celsius. It’s particularly
problematic that the user receives no notification regarding these drastic changes which severely impact
performance. As a further complication, even though the throttling action brings down temperatures
dramatically, performance levels are sometimes not restored for a long time, requiring a system reboot to clear
the throttled condition.
The possible role of ACPI
The Advanced Configuration and Power Interface is an interface specification standard promoted by Intel,
Microsoft, Phoenix Technologies and other corporations which allows operating systems to communicate in a
standardized way with physical hardware devices in a computer system. It provides a robust and uniform way
for operating systems such as Windows to detect, monitor and configure a wide spectrum of devices including
very low-level hardware such as power supplies, batteries, cooling fans and processor chips. In a system that is
fully compliant with ACPI, configuration of such devices is performed exclusively through the operating
system via the ACPI interface 8 .
The throttling behavior observed would be consistent with an ACPI “passive cooling” policy. Passive cooling is
defined in ACPI as a method whereby the “OS reduces the power consumption of devices at the cost of system
performance to reduce the temperature of the machine.” In the standard, a temperature threshold _PSV is
defined beyond which passive cooling is engaged. ACPI suggests providing hysteresis by changing _PSV to a
lower temperature once the initial threshold is reached and using that new value as a target temperature while
passive cooling is engaged. An equation is even suggested for how to adjust the system’s performance to
achieve the lower temperature (using vendor-provided constants _TC1 and _TC2). The standard provides for
periodic polling of the system temperature for the purposes of passive cooling using a sampling period _TSP 9 .
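A minimal sketch of this kind of feedback loop follows, assuming the form of the equation commonly quoted from the ACPI specification: ΔP[%] = _TC1 × (Tn − Tn−1) + _TC2 × (Tn − Tt), where Tn is the current temperature, Tn−1 the previous sample and Tt the target temperature, with performance clamped between 0% and 100%. The constant values and temperatures used below are hypothetical; they were not read from the system under test, and the function name is illustrative.

```python
def passive_cooling_step(perf_pct, temp_now, temp_prev, target_temp, tc1, tc2):
    """Return the new performance level (in percent) after one sampling period.

    delta_p grows when the temperature is rising (tc1 term) and when it is
    above the target temperature (tc2 term).
    """
    delta_p = tc1 * (temp_now - temp_prev) + tc2 * (temp_now - target_temp)
    # Performance is reduced by delta_p and kept within 0..100 percent.
    return max(0.0, min(100.0, perf_pct - delta_p))


# Hypothetical constants and temperatures (degrees Celsius), for illustration only.
TC1, TC2 = 1, 2
TARGET_TEMP = 65

perf = 100.0
samples = [70, 72, 71, 68, 66]  # one reading per _TSP sampling period
for prev, now in zip(samples, samples[1:]):
    perf = passive_cooling_step(perf, now, prev, TARGET_TEMP, TC1, TC2)
    print(f"T = {now} C -> performance {perf:.0f}%")
```

In this sketch, performance keeps dropping for as long as the temperature stays above the target, even while the temperature is falling. That would be consistent with throttling persisting after temperatures decline, though it does not confirm how the system under test actually behaves.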
While it’s not practically possible to independently confirm conclusively that this system is fully compliant with
ACPI, or even that this system’s performance loss is being effected through ACPI (particularly since passive
cooling is not an absolute requirement for ACPI compliance), there are a number of facts that strongly suggest
both may be the case.
1. The strings “_TC1”, “_TC2”, “_PSV” and “_TSP”, which are particular to the “passive cooling” option
of the ACPI standard, all appear in the file ACPI.SYS under
C:\WINDOWS\SYSTEM32\DRIVERS.
2. Microsoft (the operating system vendor), Intel (the CPU vendor) and Phoenix Technologies (the BIOS
vendor) are all promoters of the ACPI standard. Phoenix Technologies can be confirmed as the BIOS
vendor for the Dell Latitude E6500 using a BIOS scan at address 0xF0000 (see screen capture image
below).
[Screen capture image]
3. The Phoenix BIOS in the Dell Latitude E6500 contains the expected ACPI tables and code required by
the standard (Root System Description Pointer at address 0xFB9C0, Extended System Description Table
at address 0xDF451E00, Fixed ACPI Description Table at address 0xDF451C9C, Differentiated System
Description Table at address 0xDF452400, Secondary System Description Table at address 0xDF45032D,
etc.). A screen capture image of the beginning of the Differentiated System Description Table appears
below:
Illustration: maximum performance loss due to clock throttling at normal operating temperatures
Following is an illustrated example of exactly how the Dell Latitude E6500 is progressively throttled to the
point that it is operating at less than 5% of processor capacity. Screen captures are used to chronicle the process.
First, here is a more complete description of the test system:
Dell Latitude E6500
Intel Core 2 Duo P8400 processor, 2.26 GHz
4GB RAM
NVIDIA Quadro NVS 160M graphics
Dell BIOS version A13
E/Port Plus Advanced Port Replicator (Docking Station)
dual Dell E248WFP monitors (only one monitor used in some tests)
Windows XP Service Pack 3
Configuration note: A key adjustment for these tests was to switch to the “Always On” power scheme in the “Power Options”
Control Panel window. Choosing the “Always On” power scheme disables Intel’s SpeedStep feature, which temporarily
downclocks the processor (by transitioning among the various P-states) when it is idle. To avoid confusion about what caused
a P-state change, it was important to disable SpeedStep.