During AP bringup, just after switching to long mode, APs will do some
cpuid calls to verify that the extended topology leaf (0xB) is available
so they can fetch their x2 APIC IDs from it. In the case of SEV-ES,
these cpuid instructions must be handled by direct use of the GHCB MSR
protocol to fetch the values from the hypervisor, since a #VC handler
is not yet available due to the AP's stack not being set up yet.
For SEV-SNP, rather than relying on the GHCB MSR protocol, it is
expected that these values would be obtained from the SEV-SNP CPUID
table instead. The actual x2 APIC ID (and 8-bit APIC IDs) would still
be fetched from hypervisor using the GHCB MSR protocol however, so
introducing support for the SEV-SNP CPUID table in that part of the AP
bring-up code would only be to handle the checks/validation of the
extended topology leaf.
Rather than introducing all the added complexity needed to handle these
checks via the CPUID table, instead let the BSP do the check in advance,
since it can make use of the #VC handler to avoid the need to scan the
SNP CPUID table directly, and add a flag in ExchangeInfo to communicate
the result of this check to APs.
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Min Xu <min.m.xu@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Ray Ni <ray.ni@intel.com>
Suggested-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3634
The memory allocated through "PeiAllocatePool" is located in HOB, and
in DXE phase, the HOB will be migrated to a different location.
After the migration, the data stored in the HOB stays the same, but the
address of pointer to the memory(such as the pointers in ACPI_CPU_DATA
structure) changes, which may cause "PiSmmCpuDxeSmm" driver can't find
the memory(the pointers in ACPI_CPU_DATA structure) that allocated in
"PeiRegisterCpuFeaturesLib", so use "PeiAllocatePages" to allocate
memory instead.
Signed-off-by: Jason Lou <yun.lou@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
REF:https://bugzilla.tianocore.org/show_bug.cgi?id=3492
Currently SecCore.inf having the resetvector code under IA32. if the
user wants to use both SecCore and UefiCpuPkg ResetVector it's not
possible, since SecCore and ResetVector(VTF0.INF/ResetVector.inf)
are sharing the same GUID which is BFV. to overcome this issue we can
create the Duplicate version of the SecCore.inf as SecCoreNative.inf
which contains pure SecCore Native functionality without resetvector.
SecCoreNative.inf should have the Unique GUID so that it can be used
along with UefiCpuPkg ResetVector in there implementation.
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: Debkumar De <debkumar.de@intel.com>
Cc: Harry Han <harry.han@intel.com>
Cc: Catharine West <catharine.west@intel.com>
Cc: Digant H Solanki <digant.h.solanki@intel.com>
Cc: Sangeetha V <sangeetha.v@intel.com>
Signed-off-by: Ashraf Ali S <ashraf.ali.s@intel.com>
REF:https://bugzilla.tianocore.org/show_bug.cgi?id=3473
X64 Reset Vector Code can access the memory range till 4GB using the
Linear-Address Translation to a 2-MByte Page, when user wants to use
more than 4G using 2M Page it will leads to use more number of Page
table entries. using the 1-GByte Page table user can use more than
4G Memory by reducing the page table entries using 1-GByte Page,
this patch attached can access memory range till 512GByte via Linear-
Address Translation to a 1-GByte Page.
Build Tool: if the nasm is not found it will throw Build errors like
FileNotFoundError: [WinError 2]The system cannot find the file specified
run the command wil try except block to get meaningful error message
Test Result: Tested in both Simulation environment and Hardware
both works fine without any issues.
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: Debkumar De <debkumar.de@intel.com>
Cc: Harry Han <harry.han@intel.com>
Cc: Catharine West <catharine.west@intel.com>
Cc: Sangeetha V <sangeetha.v@intel.com>
Cc: Rangasai V Chaganty <rangasai.v.chaganty@intel.com>
Cc: Sahil Dureja <sahil.dureja@intel.com>
Signed-off-by: Ashraf Ali S <ashraf.ali.s@intel.com>
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3621
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3631
Current CPU feature initialization design:
During normal boot, CpuFeaturesPei module (inside FSP) initializes the
CPU features. During S3 boot, CpuFeaturesPei module does nothing, and
CpuSmm driver (in SMRAM) initializes CPU features instead.
This code change prevents CpuSmm driver from re-initializing CPU
features during S3 resume if CpuFeaturesPei module has done the same
initialization.
In addition, EDK2 contains DxeIpl PEIM that calls S3RestoreConfig2 PPI
during S3 boot and this PPI eventually calls CpuSmm driver (in SMRAM) to
initialize the CPU features, so "EDK2 + FSP" does not have the CPU
feature initialization issue during S3 boot. But "coreboot" does not
contain DxeIpl PEIM and the issue appears, unless
"PcdCpuFeaturesInitOnS3Resume" is set to TRUE.
Signed-off-by: Jason Lou <yun.lou@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
When enter SMM exception, there will be a stack switch only if the IST
field of the interrupt gate is set. When CET shadow stack feature is
enabled, if there is a stack switch between SMM exception and SMM, the
shadow stack token busy bit needs to be cleared when return from SMM
exception to SMM. In UEFI BIOS, only page fault exception does the stack
swith when SMM shack guard feature is enabled. The condition of clear
shadow stack token busy bit should be SMM stack guard enabled, CET shadows
stack feature enabled and page fault exception.
The shadow stack token should be initialized by UINT64.
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3462
Signed-off-by: Sheng Wei <w.sheng@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Qihua Zhuang <qihua.zhuang@intel.com>
Cc: Daquan Dong <daquan.dong@intel.com>
Cc: Justin Tong <justin.tong@intel.com>
Cc: Tom Xu <tom.xu@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=3324
The SEV-ES stacks currently share a page with the reset code and data.
Separate the SEV-ES stacks from the reset vector code and data to avoid
possible stack overflows from overwriting the code and/or data.
When SEV-ES is enabled, invoke the GetWakeupBuffer() routine a second time
to allocate a new area, below the reset vector and data.
Both the PEI and DXE versions of GetWakeupBuffer() are changed so that
when PcdSevEsIsEnabled is true, they will track the previous reset buffer
allocation in order to ensure that the new buffer allocation is below the
previous allocation. When PcdSevEsIsEnabled is false, the original logic
is followed.
Fixes: 7b7508ad78
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: Marvin Häuser <mhaeuser@posteo.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <3cae2ac836884b131725866264e0a0e1897052de.1621024125.git.thomas.lendacky@amd.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
5-level paging can be enabled on CPU which supports up to 52 physical
address size. But when the feature was enabled, the 48 address size
limit was not removed and the 5-level paging testing didn't access
address >= 2^48. So the issue wasn't detected until recently an
address >= 2^48 is accessed.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3300
Current implementation of SetStaticPageTable routine in PiSmmCpuDxeSmm
driver will check a global variable mPhysicalAddressBits, and eventually
cap any value larger than 39 at 39.
This global variable is used in ConvertMemoryPageAttributes, which backs
SmmSetMemoryAttributes and SmmClearMemoryAttributes. Thus for a processor
that supports more than 39 bits width, trying to mark page table regions
higher than 39-bit will always return EFI_UNSUPPROTED.
This change updated the interface of SetStaticPageTable function to take
PhysicalAddressBits as an input parameter, in order to avoid changing/
accessing the global variable.
Cc: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Fixes: 4eee0cc7cc
Signed-off-by: Kun Qin <kuqin12@gmail.com>
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=2832
1. Remove PEI instance(PeiCpuTimerLib).
PeiCpuTimerLib is currently designed to save time by getting CPU TSC
frequency from Hob. BaseCpuTimerLib is designed to calculate TSC frequency
by using CPUID[15h] each time.
The time it takes to find CpuCrystalFrequencyHob (about 2000ns) is much
longer than it takes to calculate TSC frequency with CPUID[15h] (about
450ns), which means using BaseCpuTimerLib to trigger a delay is more
accurate than using PeiCpuTimerLib, recommend to use BaseCpuTimerLib
instead of PeiCpuTimerLib.
2. Remove DXE instance(DxeCpuTimerLib).
DxeCpuTimerLib is designed to calculate TSC frequency with CPUID[15h] in
its constructor function, then save it in a global variable. For this
design, once the driver containing this instance is running, this
constructor function is called, it will take extra time to calculate TSC
frequency.
The time it takes to get TSC frequency from global variable is shorter
than it takes to calculate TSC frequency with CPUID[15h], but 450ns is a
short time, the impact on the platform is very limited.
In addition, in order to simplify the code, recommend to use
BaseCpuTimerLib instead of DxeCpuTimerLib.
I did some experiments on one server platform and collected following data:
1. Average time required to find CpuCrystalFrequencyHob: about 2000 ns.
2. Average time required to find the last Hob: about 2700 ns.
2. Average time required to calculate TSC frequency: about 450 ns.
Reference code:
//
// Calculate average time required to find Hob.
//
DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] GetPerformanceCounterFrequency - GetFirstGuidHob (1000 cycles)\n"));
Ticks1 = AsmReadTsc();
for (i = 0; i < 1000; i++) {
GuidHob = GetFirstGuidHob (&mCpuCrystalFrequencyHobGuid);
}
Ticks2 = AsmReadTsc();
if (GuidHob == NULL) {
DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] - CpuCrystalFrequencyHob can not be found!\n"));
} else {
DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] - Average time required to find Hob = %d ns\n", \
DivU64x32(DivU64x64Remainder(MultU64x32((Ticks2 - Ticks1), 1000000000), *CpuCrystalCounterFrequency, NULL), 1000)));
}
//
// Calculate average time required to calculate CPU frequency.
//
DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] GetPerformanceCounterFrequency - CpuidCoreClockCalculateTscFrequency (1000 cycles)\n"));
Ticks1 = AsmReadTsc();
for (i = 0; i < 1000; i++) {
Freq = CpuidCoreClockCalculateTscFrequency ();
}
Ticks2 = AsmReadTsc();
DEBUG((DEBUG_ERROR, "[PeiCpuTimerLib] - Average time required to calculate TSC frequency = %d ns\n", \
DivU64x32(DivU64x64Remainder(MultU64x32((Ticks2 - Ticks1), 1000000000), *CpuCrystalCounterFrequency, NULL), 1000)));
Signed-off-by: Jason Lou <yun.lou@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
The comments in PiSmmCommunicationPei.c describe the whole memory
layout of the SMRAM regarding the SMM communication.
But SHA-1: 8b1d149390
PiSmmCommunicationSmm: Deprecate SMM Communication ACPI Table
removed the code that produces the ACPI Table.
This change updates the accordingly comments.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3233
GDT needs to be allocated below 4GB in 64bit environment
because AP needs it for entering to protected mode.
CPU running in big real mode cannot access above 4GB GDT.
But CpuDxe driver contains below code:
gdt = AllocateRuntimePool (sizeof (GdtTemplate) + 8);
.....
gdtPtr.Base = (UINT32)(UINTN)(VOID*) gdt;
The AllocateRuntimePool() may allocate memory above 4GB.
Thus, we cannot use AllocateRuntimePool (), instead,
we should use AllocatePages() to make sure GDT is below 4GB space.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
MpInitLib contains a function MicrocodeDetect() which is called by
all threads as an AP procedure.
Today this function contains below code:
if (CurrentRevision != LatestRevision) {
AcquireSpinLock(&CpuMpData->MpLock);
DEBUG ((
EFI_D_ERROR,
"Updated microcode signature [0x%08x] does not match \
loaded microcode signature [0x%08x]\n",
CurrentRevision, LatestRevision
));
ReleaseSpinLock(&CpuMpData->MpLock);
}
When the if-check is passed, the code may call into PEI services:
1. AcquireSpinLock
When the PcdSpinTimeout is not 0, TimerLib
GetPerformanceCounterProperties() is called. And some of the
TimerLib implementations would get the information cached in
HOB. But AP procedure cannot call PEI services to retrieve the
HOB list.
2. DEBUG
Certain DebugLib relies on ReportStatusCode services and the
ReportStatusCode PPI is retrieved through the PEI services.
DebugLibSerialPort should be used.
But when SerialPortLib is implemented to depend on PEI services,
even using DebugLibSerialPort can still cause AP calls PEI
services resulting hang.
It causes a lot of debugging effort on the platform side.
There are 2 options to fix the problem:
1. make sure platform DSC chooses the proper DebugLib and set the
PcdSpinTimeout to 0. So that AcquireSpinLock and DEBUG don't call
PEI services.
2. remove the AcquireSpinLock and DEBUG call from the procedure.
Option #2 is preferred because it's not practical to ask every
platform DSC to be written properly.
Following option #2, there are two sub-options:
2.A. Just remove the if-check.
2.B. Capture the CurrentRevision and ExpectedRevision in the memory
for each AP and print them together from BSP.
The patch follows option 2.B.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>