UEFI requires us to support nested interrupts, but provides no way for
an interrupt handler to call RestoreTPL() without implicitly
re-enabling interrupts. In a virtual machine, it is possible for a
large burst of interrupts to arrive. We must prevent such a burst
from leading to stack underrun, while continuing to allow nested
interrupts to occur.
This can be achieved by allowing, when provably safe to do so, an
inner interrupt handler to return from the interrupt without restoring
the TPL and with interrupts remaining disabled after IRET, with the
deferred call to RestoreTPL() then being issued from the outer
interrupt handler. This is necessarily messy and involves direct
manipulation of the interrupt stack frame, and so should not be
implemented as open-coded logic within each interrupt handler.
Add the Nested Interrupt TPL Library (NestedInterruptTplLib) to
provide helper functions that can be used by nested interrupt handlers
in place of RaiseTPL()/RestoreTPL().
Example call tree for a timer interrupt occurring at TPL_APPLICATION
with a nested timer interrupt that makes its own call to RestoreTPL():
outer TimerInterruptHandler()
InterruptedTPL == TPL_APPLICATION
...
IsrState->InProgressRestoreTPL = TPL_APPLICATION;
gBS->RestoreTPL (TPL_APPLICATION);
EnableInterrupts();
dispatch a TPL_CALLBACK event
gEfiCurrentTpl = TPL_CALLBACK;
nested timer interrupt occurs
inner TimerInterruptHandler()
InterruptedTPL == TPL_CALLBACK
...
IsrState->InProgressRestoreTPL = TPL_CALLBACK;
gBS->RestoreTPL (TPL_CALLBACK);
EnableInterrupts();
DisableInterrupts();
IsrState->InProgressRestoreTPL = TPL_APPLICATION;
IRET re-enables interrupts
... finish dispatching TPL_CALLBACK events ...
gEfiCurrentTpl = TPL_APPLICATION;
DisableInterrupts();
IsrState->InProgressRestoreTPL = 0;
sees IsrState->DeferredRestoreTPL == FALSE and returns
IRET re-enables interrupts
Example call tree for a timer interrupt occurring at TPL_APPLICATION
with a nested timer interrupt that defers its call to RestoreTPL() to
the outer instance of the interrupt handler:
outer TimerInterruptHandler()
InterruptedTPL == TPL_APPLICATION
...
IsrState->InProgressRestoreTPL = TPL_APPLICATION;
gBS->RestoreTPL (TPL_APPLICATION);
EnableInterrupts();
dispatch a TPL_CALLBACK event
... finish dispatching TPL_CALLBACK events ...
gEfiCurrentTpl = TPL_APPLICATION;
nested timer interrupt occurs
inner TimerInterruptHandler()
InterruptedTPL == TPL_APPLICATION;
...
sees InterruptedTPL == IsrState->InProgressRestoreTPL
IsrState->DeferredRestoreTPL = TRUE;
DisableInterruptsOnIret();
IRET returns without re-enabling interrupts
DisableInterrupts();
IsrState->InProgressRestoreTPL = 0;
sees IsrState->DeferredRestoreTPL == TRUE and loops
IsrState->InProgressRestoreTPL = TPL_APPLICATION;
gBS->RestoreTPL (TPL_APPLICATION); <-- deferred call
EnableInterrupts();
DisableInterrupts();
IsrState->InProgressRestoreTPL = 0;
sees IsrState->DeferredRestoreTPL == FALSE and returns
IRET re-enables interrupts
Cc: Paolo Bonzini <pbonzini@redhat.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=4162
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Deferring the EOI until after the call to RestoreTPL() means that any
callbacks invoked by RestoreTPL() will run with timer interrupt
delivery disabled. If any such callbacks themselves rely on timers to
implement timeout loops, then the callbacks will get stuck in an
infinite loop from which the system will never recover.
This reverts commit 239b50a86 ("OvmfPkg: End timer interrupt later to
avoid stack overflow under load").
Cc: Paolo Bonzini <pbonzini@redhat.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=4162
Signed-off-by: Michael Brown <mcb30@ipxe.org>
Acked-by: Laszlo Ersek <lersek@redhat.com>
qemu uses the etc/e820 fw_cfg file not only for memory, but
also for reservations. Handle reservations by adding resource
descriptor hobs for them.
A typical qemu configuration has a small reservation between
lapic and flash:
# sudo cat /proc/iomem
[ ... ]
fee00000-fee00fff : Local APIC
feffc000-feffffff : Reserved <= HERE
ffc00000-ffffffff : Reserved
[ ... ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The Hii form is named "MainFormState" and the EFI variable is named
"PlatformConfig". Take into account the different names.
Fixes: aefcc91805 ("OvmfPkg/PlatformDxe: Handle all requests in ExtractConfig and RouteConfig")
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
BDS module was moved from DXEFV to newly created BDSFV recently.
Non-universal UEFI payload doesn't support multiple FV, so it failed
to boot since BDS module could not be found.
This patch add BDS back to DXEFV when UNIVERSAL_PAYLOAD is not set.
Cc: Ray Ni <ray.ni@intel.com>
Cc: Sean Rhodes <sean@starlabs.systems>
Cc: James Lu <james.lu@intel.com>
Cc: Gua Guo <gua.guo@intel.com>
Signed-off-by: Guo Dong <guo.dong@intel.com>
Reviewed-by: James Lu <james.lu@intel.com>
Reviewed-by: Gua Guo <gua.guo@intel.com>
PcdConfidentialComputingGuestAttr can be used to check the cc guest
type, including td-guest or sev-guest. CcProbe() can do the same
thing but CcProbeLib should be included in the dsc which uses
AcpiPlatformDxe. The difference between PcdConfidentialComputingGuestAttr
and CcProbe() is that PcdConfidentialComputingGuestAttr cannot be used
in multi-processor scenario but CcProbe() can. But there is no such
issue in AcpiPlatformDxe.
So we use PcdConfidentialComputingGuestAttr instead of CcProbeLib so that
it is simpler.
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
Fixes problems due to code assuming it runs with frame pointers and thus
updates rbp / ebp registers when switching stacks.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Liming Gao <gaoliming@byosoft.com.cn>
Simplify the code to set memory used by smm page table as RO.
Since memory used by smm page table are in PageTablePool list,
we only need to set all PageTablePool as ReadOnly in smm page
table itself. Also, we only need to flush tlb once after
setting all page table pool as Read Only.
Signed-off-by: Dun Tan <dun.tan@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Remove SmmCpuFeaturesAllocatePageTableMemory in this headfile.
This API is not used by PiSmmCpuDxeSmm driver any more. Also
no other files use this API.
Signed-off-by: Dun Tan <dun.tan@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Introduce page table pool mechanism for smm page table to simplify
page table memory management and protection. This mechanism has been
used in DxeIpl. The basic idea is to allocate a bunch of continuous
pages of memory in advance, and all future page tables consumption
will happen in those pool instead of system memory.
Since we have centralized page tables, we only need to mark all page
table pools as RO, instead of searching page table memory layer by
layer in smm page table. Once current page table pool has been used
up, another memory pool will be allocated and the new pool will also
be set as RO if current page table memory has been marked as RO.
Signed-off-by: Dun Tan <dun.tan@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
Copy the function BuildPlatformInfoHob() from OvmfPkg/PlatformPei.
QemuFwCfgLib expect this HOB to be present, or fails to do anything.
InternalQemuFwCfgIsAvailable() from QemuFwCfgPeiLib module will not
check if the HOB is actually present for example and try to use a NULL
pointer.
Fixes: cda98df162 ("OvmfPkg/QemuFwCfgLib: remove mQemuFwCfgSupported + mQemuFwCfgDmaSupported")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Jiewen Yao <jiewen.yao@intel.com>
BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=4172
TDVF once accepts memory only by BSP. To improve the boot performance
this patch introduce the multi-core accpet memory. Multi-core means
BSP and APs work together to accept memory.
TDVF leverages mailbox to wake up APs. It is not enabled in MpInitLib
(Which requires SIPI). So multi-core accept memory cannot leverages
MpInitLib to coordinate BSP and APs to work together.
So TDVF split the accept memory into 2 phases.
- AcceptMemoryForAPsStack:
BSP accepts a small piece of memory which is then used by APs to setup
stack. We assign a 16KB stack for each AP. So a td-guest with 256 vCPU
requires 255*16KB = 4080KB.
- AcceptMemory:
After above small piece of memory is accepted, BSP commands APs to
accept memory by sending AcceptPages command in td-mailbox. Together
with the command and accpet-function, the APsStack address is send
as well. APs then set the stack and jump to accept-function to accept
memory.
AcceptMemoryForAPsStack accepts as small memory as possible and then jump
to AcceptMemory. It fully takes advantage of BSP/APs to work together.
After accept memory is done, the memory region for APsStack is not used
anymore. It can be used as other private memory. Because accept-memory
is in the very beginning of boot process and it will not impact other
phases.
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=4172
TdxMailboxLib is designed only for TDX guest which arch is X64. This
patch set the VALID_ARCHITECTURES of TdxMailboxLib as X64.
Because in the following patches TdxMailboxLib will be included in
PlatformInitLib. While PlatformInitLib is imported by some X64 platforms
(for example AmdSevX64.dsc). So we need a NULL instance of TdxMailboxLib
which VALID_ARCHITECTURES is X64 as well. Based on this consideration
we design TdxMailboxLibNull.
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
- Fix typos of "disable".
- Fix typos of "performance".
- Fix missing spaces.
- Use comma instead of period when the sentence continues on the next
line.
- Fix typo of "PERF_CORE_LOAD_IMAGE".
Signed-off-by: Rebecca Cran <rebecca@quicinc.com>
Reviewed-by: Liming Gao <gaoliming@byosoft.com.cn>
Fix typo of EFI_INVALID_PARAMETER in Protocol/UsbIo.h by adding a
missing 'R'.
Signed-off-by: Rebecca Cran <rebecca@quicinc.com>
Reviewed-by: Liming Gao <gaoliming@byosoft.com.cn>
On some platforms, including Sky Lake and Kaby Lake, the PSIV (Protocol
Speed ID Value) indices are shared between Protocol Speed ID DWORD' in
the extended capabilities registers for both USB2 (Full Speed) and USB3
(Super Speed).
An example can be found below:
XhcCheckUsbPortSpeedUsedPsic: checking for USB2 ext caps
XhciPsivGetPsid: found 3 PSID entries
XhciPsivGetPsid: looking for port speed 1
XhciPsivGetPsid: PSIV 1 PSIE 2 PLT 0 PSIM 12
XhciPsivGetPsid: PSIV 2 PSIE 1 PLT 0 PSIM 1500
XhciPsivGetPsid: PSIV 3 PSIE 2 PLT 0 PSIM 480
XhcCheckUsbPortSpeedUsedPsic: checking for USB3 ext caps
XhciPsivGetPsid: found 3 PSID entries
XhciPsivGetPsid: looking for port speed 1
XhciPsivGetPsid: PSIV 1 PSIE 3 PLT 0 PSIM 5
XhciPsivGetPsid: PSIV 2 PSIE 3 PLT 0 PSIM 10
XhciPsivGetPsid: PSIV 34 PSIE 2 PLT 0 PSIM 1248
The result is edk2 detecting USB2 devices as USB3 devices, which
consequently causes enumeration to fail.
To avoid incorrect detection, check the Compatible Port Offset to find
the starting Port of Root Hubs that support the protocol.
Signed-off-by: Sean Rhodes <sean@starlabs.systems>
Reviewed-by: Hao A Wu <hao.a.wu@intel.com>
PSID matching relies on comparing the PSIV against the PortSpeed
value. This patch stops edk2 from checking for a PSIV of 0, as it
is not valid; this reduces the number of register access by
approximately 6 per second.
Cc: Hao A Wu <hao.a.wu@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Signed-off-by: Matt DeVillier <matt.devillier@gmail.com>
Reviewed-by: Sean Rhodes <sean@starlabs.systems>
Reviewed-by: Hao A Wu <hao.a.wu@intel.com>
During the finalization of Mp initialization before booting into the OS,
depending on whether Mwait is supported or not, AsmRelocateApLoop
places Aps in MWAIT-loop or HLT-loop.
Since paging is necessary for long mode, the original implementation of
moving APs to 32-bit was to disable paging to ensure that the booting
does not crash.
The current modification creates a page table in reserved memory,
avoiding switching modes and reclaiming memory by OS. This modification
is only for 64 bit mode.
More specifically, we keep the AMD logic as the original code flow,
extract and update the Intel-related code, where the APs would stay
in 64-bit, and run in a Mwait or Hlt loop until the OS wake them up.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Signed-off-by: Yuanhao Xie <yuanhao.xie@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
AsmRelocateApLoop is replicated for future Intel Logic Extraction,
further brings AP into 64-bit, and enables paging.
Signed-off-by: Yuanhao Xie <yuanhao.xie@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Add condition to return success if mUartCount is greater
than zero in SerialPortInitialize() to avoid filling mUartInfo
with the same hob data when SerialPortInitialize() is called
multiple times. Also add proper conditions in SerialPortRead
function to read the data properly from multiple UART's.
Cc: Guo Dong <guo.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: James Lu <james.lu@intel.com>
Reviewed-by: Gua Guo <gua.guo@intel.com>
Signed-off-by: Kavya <k.kavyax.sravanthi@intel.com>
For some use cases, Redfish host interface table relies on
the certain EFI protocols installation at the driver connection.
Redfish host interface DXE driver is not able to build the
SMBIOS type 42h record at driver entry point. This patch adds
the mechanism in Redfish host interface DXE driver to listen
to EFI protocol installed by platform library that indicates
the necessary information is ready for building SMBIOS 42h
record.
Signed-off-by: Abner Chang <abner.chang@amd.com>
Cc: Nickle Wang <nicklew@nvidia.com>
Cc: Igor Kulchytskyy <igork@ami.com>
Reviewed-by: Nickle Wang <nicklew@nvidia.com>
In the commit 4f173db8b4 "OvmfPkg/PlatformInitLib: Add functions for
EmuVariableNvStore", it introduced a PlatformValidateNvVarStore() function
for checking the integrity of NvVarStore.
In some cases when the VariableHeader->StartId is VARIABLE_DATA, the
VariableHeader->State is not just one of the four primary states:
VAR_IN_DELETED_TRANSITION, VAR_DELETED, VAR_HEADER_VALID_ONLY, VAR_ADDED.
The state may combined two or three states, e.g.
0x3C = (VAR_IN_DELETED_TRANSITION & VAR_ADDED) & VAR_DELETED
or
0x3D = VAR_ADDED & VAR_DELETED
When the variable store has those variables, system booting/rebooting will
hangs in a ASSERT:
NvVarStore Variable header State was invalid.
ASSERT
/mnt/working/source_code-git/edk2/OvmfPkg/Library/PlatformInitLib/Platform.c(819):
((BOOLEAN)(0==1))
Adding more log to UpdateVariable() and PlatformValidateNvVarStore(), we
saw some variables which have 0x3C or 0x3D state in store.
e.g.
UpdateVariable(), VariableName=BootOrder
L1871, State=0000003F <-- VAR_ADDED
State &= VAR_DELETED=0000003D
FlushHobVariableToFlash(), VariableName=BootOrder
...
UpdateVariable(), VariableName=InitialAttemptOrder
L1977, State=0000003F
State &= VAR_IN_DELETED_TRANSITION=0000003E
L2376, State=0000003E
State &= VAR_DELETED=0000003C
FlushHobVariableToFlash(), VariableName=InitialAttemptOrder
...
UpdateVariable(), VariableName=ConIn
L1977, State=0000003F
State &= VAR_IN_DELETED_TRANSITION=0000003E
L2376, State=0000003E
State &= VAR_DELETED=0000003C
FlushHobVariableToFlash(), VariableName=ConIn
...
So, only allowing the four primary states is not enough. This patch changes
the falid states list (Follow Jiewen Yao's suggestion):
1. VAR_HEADER_VALID_ONLY (0x7F)
- Header added (*)
2. VAR_ADDED (0x3F)
- Header + data added
3. VAR_ADDED & VAR_IN_DELETED_TRANSITION (0x3E)
- marked as deleted, but still valid, before new data is added. (*)
4. VAR_ADDED & VAR_IN_DELETED_TRANSITION & VAR_DELETED (0x3C)
- deleted, after new data is added.
5. VAR_ADDED & VAR_DELETED (0x3D)
- deleted directly, without new data.
(*) means to support surprise shutdown.
And removed (VAR_IN_DELETED_TRANSITION) and (VAR_DELETED) because they are
invalid states.
v2:
Follow Jiewen Yao's suggestion to add the following valid states:
VAR_ADDED & VAR_DELETED (0x3D)
VAR_ADDED & VAR_IN_DELETED_TRANSITION (0x3E)
VAR_ADDED & VAR_IN_DELETED_TRANSITION & VAR_DELETED (0x3C)
and removed the following invalid states:
VAR_IN_DELETED_TRANSITION
VAR_DELETED
Signed-off-by: Chun-Yi Lee <jlee@suse.com>
Reviewed-by: Jiewen Yao <jiewen.yao@intel.com>
The following PCDs have no value in UefiPayloadPkg.dsc
and they can not pass the Ecc tool check, so assign
the default values the same as they are in *.dec file.
1. gEfiMdeModulePkgTokenSpaceGuid.PcdAriSupport
2. gEfiMdeModulePkgTokenSpaceGuid.PcdMrIovSupport
3. gEfiMdeModulePkgTokenSpaceGuid.PcdSrIovSuppor
4. gEfiMdeModulePkgTokenSpaceGuid.PcdSrIovSystemPageSize
5. gUefiCpuPkgTokenSpaceGuid.PcdCpuApInitTimeOutInMicroSeconds
6. gUefiCpuPkgTokenSpaceGuid.PcdCpuApLoopMode
7. gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchAddress
8. gUefiCpuPkgTokenSpaceGuid.PcdCpuMicrocodePatchRegionSize
Reviewed-by: Gua Guo <gua.guo@intel.com>
Signed-off-by: jdzhang <jdzhang@kunluntech.com.cn>
Allow object to specify the name of processor and processor container
nodes and the UID of processor containers.
This allows these to be more accurately referenced from other tables.
For example for the _PSL method or the UID in the APMT table.
The UID and Name for processor container may be different as if the
intention is to set names as the corresponding affinity level the UID
may need to be different if there are multiple levels of containers.
Signed-off-by: Jeff Brasen <jbrasen@nvidia.com>
Reviewed-by: Sami Mujawar <sami.mujawar@arm.com>
BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=4171
A typical QEMU fw_cfg read bytes with IOMMU for td guest is that:
(QemuFwCfgReadBytes@QemuFwCfgLib.c is the example)
1) Allocate DMA Access buffer
2) Map actual data buffer
3) start the transfer and wait for the transfer to complete
4) Free DMA Access buffer
5) Un-map actual data buffer
In step 1/2, Private memories are allocated, converted to shared memories.
In Step 4/5 the shared memories are converted to private memories and
accepted again. The final step is to free the pages.
This is time-consuming and impacts td guest's boot perf (both direct boot
and grub boot) badly.
In a typical grub boot, there are about 5000 calls of page allocation and
private/share conversion. Most of page size is less than 32KB.
This patch allocates a memory region and initializes it into pieces of
memory with different sizes. A piece of such memory consists of 2 parts:
the first page is of private memory, and the other pages are shared
memory. This is to meet the layout of common buffer.
When allocating bounce buffer in IoMmuMap(), IoMmuAllocateBounceBuffer()
is called to allocate the buffer. Accordingly when freeing bounce buffer
in IoMmuUnmapWorker(), IoMmuFreeBounceBuffer() is called to free the
bounce buffer. CommonBuffer is allocated by IoMmuAllocateCommonBuffer
and accordingly freed by IoMmuFreeCommonBuffer.
This feature is tested in Intel TDX pre-production platform. It saves up
to hundreds of ms in a grub boot.
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jiewen Yao <Jiewen.yao@intel.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>