Today's implementation allocates below 1MB memory for the 16bit, 32bit
and 64bit code.
But it's not necessary since now the 32bit and 64bit code run at high
memory no matter in PEI and DXE phase.
The patch simplifies the logic to remove the code that handles the
case when WakeupBufferHigh is 0.
It also reduce the memory foot print under 1MB by allocating
memory for 16bit code only.
MP_CPU_EXCHANGE_INFO is still under 1MB which is immediate
after the 16bit code.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
global in NASM file is used for symbols that are
referenced in C files.
Remove unneeded global keyword in NASM file.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Today's implementation assumes PEI phase runs at 32bit so
the execution-disable feature is not applicable.
It's not always TRUE.
The patch allocates 32bit&64bit code buffer for PEI phase as well.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Today InitializeCpuExceptionHandlersEx is called from three modules:
1. DxeCore (links to DxeCpuExceptionHandlerLib)
DxeCore expects it initializes the IDT entries as well as
assigning separate stacks for #DF and #PF.
2. CpuMpPei (links to PeiCpuExceptionHandlerLib)
and CpuDxe (links to DxeCpuExceptionHandlerLib)
It's called for each thread for only assigning separate stacks for
#DF and #PF. The IDT entries initialization is skipped because
caller sets InitData->X64.InitDefaultHandlers to FALSE.
Additionally, SecPeiCpuExceptionHandlerLib, SmmCpuExceptionHandlerLib
also implement such API and the behavior of the API is simply to initialize
IDT entries only.
Because it mixes the IDT entries initialization and separate stacks
assignment for certain exception handlers together, in order to know
whether the function call only initializes IDT entries, or assigns stacks,
we need to check:
1. value of InitData->X64.InitDefaultHandlers
2. library instance
This patch cleans up the code to separate the stack assignment to a new API:
InitializeSeparateExceptionStacks().
Only when caller calls the new API, the separate stacks are assigned.
With this change, the SecPei and Smm instance can return unsupported which
gives caller a very clear status.
The old API InitializeCpuExceptionHandlersEx() is removed in this patch.
Because no platform module is consuming the old API, the impact is none.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Jian J Wang <jian.j.wang@intel.com>
InitializeCpuExceptionHandlers() expects caller allocates IDT while
InitializeCpuInterruptHandlers() allocates 256 IDT entries itself.
InitializeCpuExceptionHandlers() fills max 32 IDT entries allocated
by caller. If caller allocates 10 entries, the API just fills 10 IDT
entries.
The inconsistency between the two APIs makes code hard to
unerstand and hard to share.
Because there is only one caller (CpuDxe) for
InitializeCpuInterruptHandler(), this patch updates CpuDxe driver
to allocates 256 IDT entries then call
InitializeCpuExceptionHandlers().
This is also a backward compatible change.
With this change, InitializeCpuInterruptHandlers() is removed
completely.
And InitializeCpuExceptionHandlers() fills max 32 entries for PEI
and SMM instance, max 256 entries for DXE instance.
Such behavior matches to the original one.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Additionally removed two useless global variables:
"SPIN_LOCK mDisplayMessageSpinLock" from SMM instance.
"UINTN mEnabledInterruptNum" from DXE instance.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Today the DXE instance allocates code page and then copies the IDT
vectors to the allocated code page. Then it fixes up the vector number
in the IDT vector.
But if we update the NASM file to generate 256 IDT vectors, there is
no need to do the copy and fix-up.
A side effect is 4096 bytes (HOOKAFTER_STUB_SIZE * 256) is used for
256 IDT vectors while 32 IDT vectors only require 512 bytes without
this change, in following library instances:
1. 32bit SecPeiCpuExceptionHandlerLib and PeiCpuExceptionHandlerLib
2. 64bit PeiCpuExceptionHandlerLib
But considering the code logic simplification, 3.5K extra space is
not a big deal.
If 3.5K is too much, we can enhance the code further to generate 32
vectors for above mentioned library instances.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Jian J Wang <jian.j.wang@intel.com>
Acked-by: Eric Dong <eric.dong@intel.com>
There are two libraries: MdePkg/CpuLib and UefiCpuPkg/UefiCpuLib and
UefiCpuPkg/UefiCpuLib will be merged to MdePkg/CpuLib. To avoid build
failure, add CpuLib dependency to all modules that depend on UefiCpuLib.
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Signed-off-by: Yu Pu <yu.pu@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
During AP bringup, just after switching to long mode, APs will do some
cpuid calls to verify that the extended topology leaf (0xB) is available
so they can fetch their x2 APIC IDs from it. In the case of SEV-ES,
these cpuid instructions must be handled by direct use of the GHCB MSR
protocol to fetch the values from the hypervisor, since a #VC handler
is not yet available due to the AP's stack not being set up yet.
For SEV-SNP, rather than relying on the GHCB MSR protocol, it is
expected that these values would be obtained from the SEV-SNP CPUID
table instead. The actual x2 APIC ID (and 8-bit APIC IDs) would still
be fetched from hypervisor using the GHCB MSR protocol however, so
introducing support for the SEV-SNP CPUID table in that part of the AP
bring-up code would only be to handle the checks/validation of the
extended topology leaf.
Rather than introducing all the added complexity needed to handle these
checks via the CPUID table, instead let the BSP do the check in advance,
since it can make use of the #VC handler to avoid the need to scan the
SNP CPUID table directly, and add a flag in ExchangeInfo to communicate
the result of this check to APs.
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Min Xu <min.m.xu@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Ray Ni <ray.ni@intel.com>
Suggested-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3634
The memory allocated through "PeiAllocatePool" is located in HOB, and
in DXE phase, the HOB will be migrated to a different location.
After the migration, the data stored in the HOB stays the same, but the
address of pointer to the memory(such as the pointers in ACPI_CPU_DATA
structure) changes, which may cause "PiSmmCpuDxeSmm" driver can't find
the memory(the pointers in ACPI_CPU_DATA structure) that allocated in
"PeiRegisterCpuFeaturesLib", so use "PeiAllocatePages" to allocate
memory instead.
Signed-off-by: Jason Lou <yun.lou@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
When enter SMM exception, there will be a stack switch only if the IST
field of the interrupt gate is set. When CET shadow stack feature is
enabled, if there is a stack switch between SMM exception and SMM, the
shadow stack token busy bit needs to be cleared when return from SMM
exception to SMM. In UEFI BIOS, only page fault exception does the stack
swith when SMM shack guard feature is enabled. The condition of clear
shadow stack token busy bit should be SMM stack guard enabled, CET shadows
stack feature enabled and page fault exception.
The shadow stack token should be initialized by UINT64.
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3462
Signed-off-by: Sheng Wei <w.sheng@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Qihua Zhuang <qihua.zhuang@intel.com>
Cc: Daquan Dong <daquan.dong@intel.com>
Cc: Justin Tong <justin.tong@intel.com>
Cc: Tom Xu <tom.xu@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=3324
The SEV-ES stacks currently share a page with the reset code and data.
Separate the SEV-ES stacks from the reset vector code and data to avoid
possible stack overflows from overwriting the code and/or data.
When SEV-ES is enabled, invoke the GetWakeupBuffer() routine a second time
to allocate a new area, below the reset vector and data.
Both the PEI and DXE versions of GetWakeupBuffer() are changed so that
when PcdSevEsIsEnabled is true, they will track the previous reset buffer
allocation in order to ensure that the new buffer allocation is below the
previous allocation. When PcdSevEsIsEnabled is false, the original logic
is followed.
Fixes: 7b7508ad78
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: Marvin Häuser <mhaeuser@posteo.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <3cae2ac836884b131725866264e0a0e1897052de.1621024125.git.thomas.lendacky@amd.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=2832
1. Remove PEI instance(PeiCpuTimerLib).
PeiCpuTimerLib is currently designed to save time by getting CPU TSC
frequency from Hob. BaseCpuTimerLib is designed to calculate TSC frequency
by using CPUID[15h] each time.
The time it takes to find CpuCrystalFrequencyHob (about 2000ns) is much
longer than it takes to calculate TSC frequency with CPUID[15h] (about
450ns), which means using BaseCpuTimerLib to trigger a delay is more
accurate than using PeiCpuTimerLib, recommend to use BaseCpuTimerLib
instead of PeiCpuTimerLib.
2. Remove DXE instance(DxeCpuTimerLib).
DxeCpuTimerLib is designed to calculate TSC frequency with CPUID[15h] in
its constructor function, then save it in a global variable. For this
design, once the driver containing this instance is running, this
constructor function is called, it will take extra time to calculate TSC
frequency.
The time it takes to get TSC frequency from global variable is shorter
than it takes to calculate TSC frequency with CPUID[15h], but 450ns is a
short time, the impact on the platform is very limited.
In addition, in order to simplify the code, recommend to use
BaseCpuTimerLib instead of DxeCpuTimerLib.
I did some experiments on one server platform and collected following data:
1. Average time required to find CpuCrystalFrequencyHob: about 2000 ns.
2. Average time required to find the last Hob: about 2700 ns.
2. Average time required to calculate TSC frequency: about 450 ns.
Reference code:
//
// Calculate average time required to find Hob.
//
DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] GetPerformanceCounterFrequency - GetFirstGuidHob (1000 cycles)\n"));
Ticks1 = AsmReadTsc();
for (i = 0; i < 1000; i++) {
GuidHob = GetFirstGuidHob (&mCpuCrystalFrequencyHobGuid);
}
Ticks2 = AsmReadTsc();
if (GuidHob == NULL) {
DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] - CpuCrystalFrequencyHob can not be found!\n"));
} else {
DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] - Average time required to find Hob = %d ns\n", \
DivU64x32(DivU64x64Remainder(MultU64x32((Ticks2 - Ticks1), 1000000000), *CpuCrystalCounterFrequency, NULL), 1000)));
}
//
// Calculate average time required to calculate CPU frequency.
//
DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] GetPerformanceCounterFrequency - CpuidCoreClockCalculateTscFrequency (1000 cycles)\n"));
Ticks1 = AsmReadTsc();
for (i = 0; i < 1000; i++) {
Freq = CpuidCoreClockCalculateTscFrequency ();
}
Ticks2 = AsmReadTsc();
DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] - Average time required to calculate TSC frequency = %d ns\n", \
DivU64x32(DivU64x64Remainder(MultU64x32((Ticks2 - Ticks1), 1000000000), *CpuCrystalCounterFrequency, NULL), 1000)));
Signed-off-by: Jason Lou <yun.lou@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
MpInitLib contains a function MicrocodeDetect() which is called by
all threads as an AP procedure.
Today this function contains below code:
if (CurrentRevision != LatestRevision) {
AcquireSpinLock(&CpuMpData->MpLock);
DEBUG ((
EFI_D_ERROR,
"Updated microcode signature [0x%08x] does not match \
loaded microcode signature [0x%08x]\n",
CurrentRevision, LatestRevision
));
ReleaseSpinLock(&CpuMpData->MpLock);
}
When the if-check is passed, the code may call into PEI services:
1. AcquireSpinLock
When the PcdSpinTimeout is not 0, TimerLib
GetPerformanceCounterProperties() is called. And some of the
TimerLib implementations would get the information cached in
HOB. But AP procedure cannot call PEI services to retrieve the
HOB list.
2. DEBUG
Certain DebugLib relies on ReportStatusCode services and the
ReportStatusCode PPI is retrieved through the PEI services.
DebugLibSerialPort should be used.
But when SerialPortLib is implemented to depend on PEI services,
even using DebugLibSerialPort can still cause AP calls PEI
services resulting hang.
It causes a lot of debugging effort on the platform side.
There are 2 options to fix the problem:
1. make sure platform DSC chooses the proper DebugLib and set the
PcdSpinTimeout to 0. So that AcquireSpinLock and DEBUG don't call
PEI services.
2. remove the AcquireSpinLock and DEBUG call from the procedure.
Option #2 is preferred because it's not practical to ask every
platform DSC to be written properly.
Following option #2, there are two sub-options:
2.A. Just remove the if-check.
2.B. Capture the CurrentRevision and ExpectedRevision in the memory
for each AP and print them together from BSP.
The patch follows option 2.B.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
REF:https://bugzilla.tianocore.org/show_bug.cgi?id=3218
Adds an INF for StandaloneMmCpuFeaturesLib, which supports building
the SmmCpuFeaturesLib code for Standalone MM. Minimal code changes
are made to allow reuse of existing code for Standalone MM.
The original INF file names are left intact (continue to use SMM
terminology) to retain backward compatibility with platforms that
use those INFs. Similarly, the pre-existing C file names are
unchanged to be consistent with the INF file names.
Note that all references in library source files to PiSmm.h have
been changed to PiMm.h for consistency.
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Signed-off-by: Michael Kubacki <michael.kubacki@microsoft.com>
Message-Id: <20210217213227.1277-6-mikuback@linux.microsoft.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
There's currently two library instances:
1. SmmCpuFeaturesLib
2. SmmCpuFeaturesLibStm
There's two constructor functions:
1. SmmCpuFeaturesLibConstructor()
2. SmmCpuFeaturesLibStmConstructor()
SmmCpuFeaturesLibConstructor() is called by
SmmCpuFeaturesLibStmConstructor() since the functionality in that
function is required by both library instances.
The declaration for SmmCpuFeaturesLibConstructor() is embedded in
"SmmStm.c" instead of being declared in a header file. Further,
that constructor function is called by the STM specific constructor.
This change moves the common code to a function called
CpuFeaturesLibInitialization() which is declared in an internal
library header file "CpuFeaturesLib.h". Each constructor simply
calls this function to perform the common functionality.
Additionally, SmmCpuFeaturesLibConstructor() is moved from
SmmCpuFeaturesLibNoStm.c into a instance-specific file allowing
SmmCpuFeaturesLibNoStm.c to contain no STM implementation agnostic
to a particular library instance.
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Signed-off-by: Michael Kubacki <michael.kubacki@microsoft.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <20210217213227.1277-4-mikuback@linux.microsoft.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>