When PageTableMap() is called to create a non-1:1 mapping,
such as [0, 1G) to [8K, 1G+8K), it should split the page entry down to
the 4K page level, but the old logic has a bug: it simply keeps using a
1G page entry.
The patch fixes the bug.
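For illustration, a minimal sketch of why the split is needed (the
variable names below are made up for the example, not taken from the
library; SIZE_1GB comes from MdePkg's Base.h):

  //
  // A 1G leaf can only map a physical frame whose offset inside the 1G
  // page equals the linear address's offset inside that page. For
  // [0, 1G) -> [8K, 1G+8K) the offsets differ by 8K, so the entry must
  // be split down to 4K leaves.
  //
  UINT64   LinearAddress   = 0x0;
  UINT64   PhysicalAddress = 0x2000;    // 8K
  BOOLEAN  NeedSplit;

  NeedSplit = ((LinearAddress & (SIZE_1GB - 1)) !=
               (PhysicalAddress & (SIZE_1GB - 1)));   // TRUE here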
Signed-off-by: Zhiguang Liu <zhiguang.liu@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
The patch replaces
LinearAddress + Offset == RegionStart
with
((LinearAddress + Offset) & RegionMask) == 0
The replacement should not cause any behavior change, because:
1. In the first iteration of the while loop, when
LinearAddress + Offset == RegionStart, the lower "BitStart" bits of
RegionStart are all zero, so the lower "BitStart" bits of
(LinearAddress + Offset) are all zero as well.
Since the lower "BitStart" bits of RegionMask are all one and the
corresponding bits of (LinearAddress + Offset) are all zero,
((LinearAddress + Offset) & RegionMask) == 0.
2. In the following iterations of the while loop, even though
RegionStart is increased by RegionLength, its lower "BitStart" bits
are still all zero.
So the two expressions remain semantically equivalent.
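A small standalone C sketch of this argument (names are illustrative,
and RegionLength is assumed to be a power of two, as page sizes are):

  #include <assert.h>
  #include <stdint.h>

  // Address stands for LinearAddress + Offset.
  static void
  CheckEquivalence (uint64_t Address, uint64_t RegionStart, uint64_t RegionLength)
  {
    uint64_t RegionMask = RegionLength - 1;   // lower "BitStart" bits all-one

    // Invariants that hold inside the while loop: RegionStart is
    // RegionLength-aligned and Address lies inside the current region.
    assert ((RegionStart & RegionMask) == 0);
    assert (Address >= RegionStart && Address < RegionStart + RegionLength);

    // Old check vs. new check: under the invariants above they always agree.
    assert ((Address == RegionStart) == ((Address & RegionMask) == 0));
  }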
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Zhiguang Liu <zhiguang.liu@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
When LinearAddress or Length is not aligned to 4KB, PageTableMap()
should return Invalid Parameter.
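A minimal sketch of the added guard (the exact placement inside
PageTableMap() is assumed; SIZE_4KB and RETURN_INVALID_PARAMETER come
from MdePkg):

  if (((LinearAddress & (SIZE_4KB - 1)) != 0) ||
      ((Length & (SIZE_4KB - 1)) != 0)) {
    return RETURN_INVALID_PARAMETER;
  }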
Signed-off-by: Zhiguang Liu <zhiguang.liu@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
The lib includes two APIs:
* PageTableMap
  It creates/updates the mapping from LA to PA.
  The implementation currently only supports the paging structures used
  in 64-bit mode. PAE paging structure support will be added in the
  future.
* PageTableParse
  It parses the page table and returns the mapping relations in an
  array of IA32_MAP_ENTRY.
The library passed some stress tests. The test code will be upstreamed
in other patches following the edk2 Unit Test framework.
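A hedged usage sketch of PageTableParse (the prototype and the
buffer-too-small pattern are assumptions based on the description above,
not quoted from the header):

  IA32_MAP_ENTRY  *Map;
  UINTN           MapCount;
  RETURN_STATUS   Status;

  //
  // First call with MapCount = 0 to learn how many entries are needed,
  // then allocate the array and parse for real.
  //
  MapCount = 0;
  Status   = PageTableParse (PageTable, PagingMode, NULL, &MapCount);
  if (Status == RETURN_BUFFER_TOO_SMALL) {
    Map    = AllocatePool (MapCount * sizeof (IA32_MAP_ENTRY));
    ASSERT (Map != NULL);
    Status = PageTableParse (PageTable, PagingMode, Map, &MapCount);
  }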
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
CPU_EXCEPTION_INIT_DATA is now an internal implementation detail of
CpuExceptionHandlerLib. The union can be removed since Ia32 and X64 have
the same definition. Also, two fields (Revision and InitDefaultHandlers)
are useless and can be removed.
Cc: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Signed-off-by: Zhiguang Liu <zhiguang.liu@intel.com>
Currently, "push byte %[Vector]" causes a nasm warning when Vector is
larger than 0x7F. This is because push accepts a signed value, and byte
means signed int8, whose maximum value is 0x7F.
When Vector is larger than 0x7F, for example 255, byte 255 turns into -1
and causes the warning "signed byte value exceeds".
To avoid the warning, use dword instead of byte; this increases each
IdtVector by 3 bytes.
For IA32, the size of each IdtVector increases from 10 bytes to 13 bytes.
For X64, the size of each IdtVector increases from 15 bytes to 18 bytes.
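A standalone C illustration of the signed-byte wrap behind the warning
(purely illustrative, not part of the patch):

  #include <stdint.h>
  #include <stdio.h>

  int
  main (void)
  {
    unsigned int  Vector = 255;             // vector number in the macro
    int8_t        AsImm8 = (int8_t)Vector;  // "byte" immediate is signed int8

    // 255 exceeds the signed int8 maximum of 0x7F (127), so it is
    // reinterpreted (on two's-complement targets) as -1, which is what
    // nasm warns about.
    printf ("Vector=%u as signed byte=%d\n", Vector, AsImm8);
    return 0;
  }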
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: Debkumar De <debkumar.de@intel.com>
Cc: Harry Han <harry.han@intel.com>
Cc: Catharine West <catharine.west@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Signed-off-by: Zhiguang Liu <zhiguang.liu@intel.com>
The AP vector consists of 2 parts:
1. the initial 16-bit code that must be under 1MB and page aligned.
2. the 32-bit/64-bit code that can be anywhere in memory with any
alignment.
Part #2 is needed because the memory under 1MB is only temporarily
"stolen" for use and is given back after all APs wake up. That memory
range is not marked as a code page in the page table, so the CPU may
trigger an exception as soon as NX is enabled.
The part #2 memory allocation can be done in MpInitLibInitialize().
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Today's implementation allocates below-1MB memory for the 16-bit, 32-bit
and 64-bit code.
But that is no longer necessary because the 32-bit and 64-bit code now
runs at high memory in both the PEI and DXE phases.
The patch simplifies the logic by removing the code that handles the
case when WakeupBufferHigh is 0.
It also reduces the memory footprint under 1MB by allocating below-1MB
memory for the 16-bit code only.
MP_CPU_EXCHANGE_INFO is still under 1MB, placed immediately after the
16-bit code.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
The global keyword in a NASM file is used for symbols that are
referenced in C files.
Remove the unneeded global keywords from the NASM file.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Today's implementation assumes the PEI phase runs in 32-bit mode, so the
execution-disable feature is not applicable.
That is not always true.
The patch allocates the 32-bit/64-bit code buffer for the PEI phase as
well.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Today InitializeCpuExceptionHandlersEx() is called from three modules:
1. DxeCore (links to DxeCpuExceptionHandlerLib)
DxeCore expects it to initialize the IDT entries as well as assign
separate stacks for #DF and #PF.
2. CpuMpPei (links to PeiCpuExceptionHandlerLib)
and CpuDxe (links to DxeCpuExceptionHandlerLib)
It's called for each thread only to assign separate stacks for #DF and
#PF. The IDT entries initialization is skipped because the caller sets
InitData->X64.InitDefaultHandlers to FALSE.
Additionally, SecPeiCpuExceptionHandlerLib and SmmCpuExceptionHandlerLib
also implement this API, and their behavior is simply to initialize the
IDT entries only.
Because the API mixes the IDT entries initialization and the separate
stack assignment for certain exception handlers together, in order to
know whether a call only initializes IDT entries or also assigns stacks,
we need to check:
1. the value of InitData->X64.InitDefaultHandlers
2. the library instance
This patch cleans up the code by separating the stack assignment into a
new API: InitializeSeparateExceptionStacks().
Only when the caller calls the new API are the separate stacks assigned.
With this change, the SecPei and Smm instances can return unsupported,
which gives the caller a very clear status.
The old API InitializeCpuExceptionHandlersEx() is removed in this patch.
Because no platform module consumes the old API, the impact is none.
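A hedged sketch of the intended calling pattern after the split (the
parameter shapes are assumptions; only the API names come from this
patch):

  //
  // Every instance: initialize the IDT entries only.
  //
  Status = InitializeCpuExceptionHandlers (VectorInfo);
  ASSERT_EFI_ERROR (Status);

  //
  // Only callers that want dedicated #DF/#PF stacks (DxeCore, CpuMpPei,
  // CpuDxe) make the second, explicit call; the SecPei and Smm instances
  // may simply return EFI_UNSUPPORTED.
  //
  Status = InitializeSeparateExceptionStacks (InitData);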
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Jian J Wang <jian.j.wang@intel.com>
InitializeCpuExceptionHandlers() expects the caller to allocate the IDT,
while InitializeCpuInterruptHandlers() allocates all 256 IDT entries
itself.
InitializeCpuExceptionHandlers() fills at most 32 IDT entries allocated
by the caller. If the caller allocates 10 entries, the API just fills 10
IDT entries.
The inconsistency between the two APIs makes the code hard to
understand and hard to share.
Because there is only one caller (CpuDxe) of
InitializeCpuInterruptHandlers(), this patch updates the CpuDxe driver
to allocate 256 IDT entries and then call
InitializeCpuExceptionHandlers().
This is also a backward compatible change.
With this change, InitializeCpuInterruptHandlers() is removed
completely.
And InitializeCpuExceptionHandlers() fills at most 32 entries for the
PEI and SMM instances, and at most 256 entries for the DXE instance.
Such behavior matches the original one.
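A hedged sketch of what the updated CpuDxe flow looks like (not the
verbatim driver code):

  IA32_IDT_GATE_DESCRIPTOR  *IdtTable;
  IA32_DESCRIPTOR           IdtDescriptor;
  EFI_STATUS                Status;

  //
  // Allocate a full 256-entry IDT, point IDTR at it, then let the
  // exception handler library fill in the entries it supports.
  //
  IdtTable = AllocateZeroPool (sizeof (IA32_IDT_GATE_DESCRIPTOR) * 256);
  ASSERT (IdtTable != NULL);

  IdtDescriptor.Base  = (UINTN)IdtTable;
  IdtDescriptor.Limit = (UINT16)(sizeof (IA32_IDT_GATE_DESCRIPTOR) * 256 - 1);
  AsmWriteIdtr (&IdtDescriptor);

  Status = InitializeCpuExceptionHandlers (NULL);
  ASSERT_EFI_ERROR (Status);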
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Additionally, two useless global variables are removed:
"SPIN_LOCK mDisplayMessageSpinLock" from the SMM instance and
"UINTN mEnabledInterruptNum" from the DXE instance.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Today the DXE instance allocates a code page and then copies the IDT
vectors to the allocated code page. Then it fixes up the vector number
in each IDT vector.
But if we update the NASM file to generate 256 IDT vectors, there is
no need to do the copy and fix-up.
A side effect is that 4096 bytes (HOOKAFTER_STUB_SIZE * 256) are used
for 256 IDT vectors, while 32 IDT vectors only require 512 bytes without
this change, in the following library instances:
1. 32-bit SecPeiCpuExceptionHandlerLib and PeiCpuExceptionHandlerLib
2. 64-bit PeiCpuExceptionHandlerLib
But considering the code logic simplification, the 3.5K of extra space
is not a big deal.
If 3.5K is too much, we can enhance the code further to generate only 32
vectors for the above-mentioned library instances.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Jian J Wang <jian.j.wang@intel.com>
Acked-by: Eric Dong <eric.dong@intel.com>
There are two libraries, MdePkg/CpuLib and UefiCpuPkg/UefiCpuLib, and
UefiCpuPkg/UefiCpuLib will be merged into MdePkg/CpuLib. To avoid build
failures, add a CpuLib dependency to all modules that depend on
UefiCpuLib.
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Signed-off-by: Yu Pu <yu.pu@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
During AP bringup, just after switching to long mode, APs will do some
cpuid calls to verify that the extended topology leaf (0xB) is available
so they can fetch their x2 APIC IDs from it. In the case of SEV-ES,
these cpuid instructions must be handled by direct use of the GHCB MSR
protocol to fetch the values from the hypervisor, since a #VC handler
is not yet available due to the AP's stack not being set up yet.
For SEV-SNP, rather than relying on the GHCB MSR protocol, it is
expected that these values would be obtained from the SEV-SNP CPUID
table instead. The actual x2 APIC ID (and 8-bit APIC IDs) would still
be fetched from the hypervisor using the GHCB MSR protocol, however, so
introducing support for the SEV-SNP CPUID table in that part of the AP
bring-up code would only serve to handle the checks/validation of the
extended topology leaf.
Rather than introducing all the added complexity needed to handle these
checks via the CPUID table, instead let the BSP do the check in advance,
since it can make use of the #VC handler to avoid the need to scan the
SNP CPUID table directly, and add a flag in ExchangeInfo to communicate
the result of this check to APs.
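A hedged sketch of the BSP-side check (the ExchangeInfo field name is an
assumption made for the illustration):

  UINT32  Eax, Ebx, Ecx, Edx;

  //
  // On the BSP a #VC handler is available, so a plain CPUID works even
  // under SEV-ES/SEV-SNP. Leaf 0xB is usable when CPUID.0B:EBX[15:0] is
  // non-zero.
  //
  AsmCpuidEx (CPUID_EXTENDED_TOPOLOGY, 0, &Eax, &Ebx, &Ecx, &Edx);
  ExchangeInfo->ExtTopoAvail = (BOOLEAN)((Ebx & 0xFFFF) != 0);  // assumed field name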
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Min Xu <min.m.xu@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Ray Ni <ray.ni@intel.com>
Suggested-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3634
The memory allocated through "PeiAllocatePool" is located in a HOB, and
in the DXE phase the HOB will be migrated to a different location.
After the migration, the data stored in the HOB stays the same, but the
address of the memory (and therefore the pointers to it, such as the
pointers in the ACPI_CPU_DATA structure) changes, which may cause the
"PiSmmCpuDxeSmm" driver to fail to find the memory that was allocated in
"PeiRegisterCpuFeaturesLib". So use "PeiAllocatePages" to allocate the
memory instead.
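A hedged sketch of the change (variable names are assumed; the point is
page allocation versus HOB-backed pool allocation):

  //
  // Before: pool memory lives inside a HOB and moves when the HOB list
  // is migrated in DXE, so pointers saved elsewhere go stale.
  //
  //   AcpiCpuData = AllocatePool (sizeof (ACPI_CPU_DATA));
  //
  // After: page allocations stay at the same address across the
  // PEI-to-DXE transition.
  //
  AcpiCpuData = AllocatePages (EFI_SIZE_TO_PAGES (sizeof (ACPI_CPU_DATA)));
  ASSERT (AcpiCpuData != NULL);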
Signed-off-by: Jason Lou <yun.lou@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
When an SMM exception is entered, a stack switch happens only if the IST
field of the interrupt gate is set. When the CET shadow stack feature is
enabled and there is a stack switch between the SMM exception and SMM,
the shadow stack token busy bit needs to be cleared when returning from
the SMM exception to SMM. In UEFI BIOS, only the page fault exception
does the stack switch, and only when the SMM stack guard feature is
enabled. So the condition for clearing the shadow stack token busy bit
should be: SMM stack guard enabled, CET shadow stack feature enabled,
and the exception is a page fault.
The shadow stack token should be initialized as a UINT64 value.
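A hedged C restatement of that condition (the symbol names are
assumptions; the real change lives in the SMM exception handler
assembly):

  if (PcdGetBool (PcdCpuSmmStackGuard) &&
      ((AsmReadCr4 () & BIT23) != 0) &&          // CR4.CET is set
      (ExceptionType == EXCEPT_IA32_PAGE_FAULT)) {
    //
    // A stack switch happened on this exception, so clear the busy bit
    // of the SMM shadow stack token before returning from the exception.
    //
  }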
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3462
Signed-off-by: Sheng Wei <w.sheng@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Qihua Zhuang <qihua.zhuang@intel.com>
Cc: Daquan Dong <daquan.dong@intel.com>
Cc: Justin Tong <justin.tong@intel.com>
Cc: Tom Xu <tom.xu@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=3324
The SEV-ES stacks currently share a page with the reset code and data.
Separate the SEV-ES stacks from the reset vector code and data to avoid
possible stack overflows from overwriting the code and/or data.
When SEV-ES is enabled, invoke the GetWakeupBuffer() routine a second
time to allocate a new area below the reset vector code and data.
Both the PEI and DXE versions of GetWakeupBuffer() are changed so that
when PcdSevEsIsEnabled is true, they will track the previous reset buffer
allocation in order to ensure that the new buffer allocation is below the
previous allocation. When PcdSevEsIsEnabled is false, the original logic
is followed.
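A hedged sketch of the allocation flow described above (sizes and
variable names are assumptions):

  //
  // First allocation: reset vector code and data, as before.
  //
  WakeupBuffer = GetWakeupBuffer (WakeupBufferSize);

  if (PcdGetBool (PcdSevEsIsEnabled)) {
    //
    // Second allocation: a separate below-1MB area for the SEV-ES AP
    // stacks, placed below the reset buffer so a stack overflow cannot
    // overwrite the reset vector code/data.
    //
    SevEsStacks = GetWakeupBuffer (ApCount * ApStackSize);
  }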
Fixes: 7b7508ad78
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Cc: Marvin Häuser <mhaeuser@posteo.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <3cae2ac836884b131725866264e0a0e1897052de.1621024125.git.thomas.lendacky@amd.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>