REF: https://bugzilla.tianocore.org/show_bug.cgi?id=3634
The memory allocated through "PeiAllocatePool" is located in HOB, and
in DXE phase, the HOB will be migrated to a different location.
After the migration, the data stored in the HOB stays the same, but the
address of pointer to the memory(such as the pointers in ACPI_CPU_DATA
structure) changes, which may cause "PiSmmCpuDxeSmm" driver can't find
the memory(the pointers in ACPI_CPU_DATA structure) that allocated in
"PeiRegisterCpuFeaturesLib", so use "PeiAllocatePages" to allocate
memory instead.
Signed-off-by: Jason Lou <yun.lou@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Rahul Kumar <rahul1.kumar@intel.com>
CpuInfo.First stores whether the current thread belongs to the first
package in the platform, first core in a package, first thread in a
core.
But the time complexity of original algorithm to calculate the
CpuInfo.First is O (n) * O (p) * O (c).
n: number of processors
p: number of packages
c: number of cores per package
The patch trades time with space by storing the first package, first
core per package, first thread per core in an array.
The time complexity becomes O (n).
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Star Zeng <star.zeng@intel.com>
Cc: Yun Lou <yun.lou@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
The required buffer size for InitOrder will be 96K when NumberOfCpus=1024.
sizeof (CPU_FEATURES_INIT_ORDER) = 96
NumberOfCpus = 1024 = 1K
sizeof (CPU_FEATURES_INIT_ORDER) * NumberOfCpus = 96K
AllocateZeroPool() will call to PeiServicesAllocatePool() which will use
EFI_HOB_MEMORY_POOL to management memory pool.
EFI_HOB_MEMORY_POOL.Header.HobLength is UINT16 type, so there is no way
for AllocateZeroPool() to allocate > 64K memory.
So AllocateZeroPool() could not be used anymore for the case above or
even bigger required buffer size.
This patch updates the code to use AllocatePages() instead of
AllocateZeroPool() to allocate buffer for InitOrder.
Signed-off-by: Star Zeng <star.zeng@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Today's code assumes every core contains the same number of threads.
It's not always TRUE for certain model.
Such assumption causes system hang when thread count per core
is different and there is core or package dependency between CPU
features (using CPU_FEATURE_CORE_BEFORE/AFTER,
CPU_FEATURE_PACKAGE_BEFORE/AFTER).
The change removes such assumption by calculating the actual thread
count per package and per core.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Cc: Yun Lou <yun.lou@intel.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Match data type and format specifier for printing.
1. Type cast ProcessorNumber and FeatureIndex to UINT32
as %d only expects a UINT32.
2. Use %08x instead of %08lx for CacheControl to print Index
as it is UINT32 type.
3. Use %016lx instead of %08lx for MemoryMapped to print
(Index | LShiftU64 (HighIndex, 32)) as it is UINT64 type.
Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Star Zeng <star.zeng@intel.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1584
The flow of CPU feature initialization logic is:
1. BSP calls GetConfigDataFunc() for each thread/AP;
2. Each thread/AP calls SupportFunc() to detect its own capability;
3. BSP calls InitializeFunc() for each thread/AP.
There is a design gap in step #3. For a package scope feature that only
requires one thread of each package does the initialization operation,
what InitializeFunc() currently does is to do the initialization
operation only CPU physical location Core# is 0.
But in certain platform, Core#0 might be disabled in hardware level
which results the certain package scope feature isn't initialized at
all.
The patch adds a new field First to indicate the CPU's location in
its parent scope.
First.Package is set for all APs/threads under first package;
First.Core is set for all APs/threads under first core of each
package;
First.Thread is set for the AP/thread of each core.
Signed-off-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Star Zeng <star.zeng@intel.com>
Cc: Michael D Kinney <michael.d.kinney@intel.com>
This debug message may be called by BSP and APs. It may
caused ASSERT when APs call this debug code.
In order to avoid system boot assert, Remove this debug
message.
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
REF: https://bugzilla.tianocore.org/show_bug.cgi?id=1972
AP calls CollectProcessorData() to collect processor info.
CollectProcessorData function finally calls PcdGetSize function to
get DynamicPCD PcdCpuFeaturesSetting value. PcdGetSize will use
PeiServices table which caused below assert info:
Processor Info: Package: 1, MaxCore : 4, MaxThread: 1
Package: 0, Valid Core : 4
ASSERT [CpuFeaturesPei] c:\projects\jsl\jsl_v1193\Edk2\MdePkg\Library
\PeiServicesTablePointerLibIdt\PeiServicesTablePointer.c(48):
PeiServices != ((void *) 0)
This change uses saved global pcd size instead of calls PcdGetSize to
fix this issue.
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Chandana Kumar <chandana.c.kumar@intel.com>
Cc: Star Zeng <star.zeng@intel.com>
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
Reviewed-by: Star Zeng <star.zeng@intel.com>
PcdCpuFeaturesSupport used to specify the platform policy about
what CPU features this platform supports. This PCD will be used
in IsCpuFeatureSupported only.
Now RegisterCpuFeaturesLib use this PCD as an template to Get the
pcd size. Update the code logic to replace it with
PcdCpuFeaturesSetting.
BZ:https://bugzilla.tianocore.org/show_bug.cgi?id=1375
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
PcdCpuFeaturesUserConfiguration.
Merge PcdCpuFeaturesUserConfiguration into PcdCpuFeaturesSetting.
Use PcdCpuFeaturesSetting as input for the user input feature setting
Use PcdCpuFeaturesSetting as output for the final CPU feature setting
BZ:https://bugzilla.tianocore.org/show_bug.cgi?id=1375
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
In AcquireSpinLock function, it may call GetPerformanceCounter which
final calls PeiService table. This code may also been used by AP but
AP should not calls PeiService. This patch update code to avoid use
AcquireSpinLock function.
BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=1411
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
V3:
Define union to specify the ppi or protocol.
V2:
1. Initialize CpuFeaturesData->MpService in CpuInitDataInitialize
and make this function been called at the begin of the
initialization.
2. let all other functions use CpuFeaturesData->MpService install
of locate the protocol itself.
V1:
GetProcessorIndex function calls GetMpPpi to get the MP Ppi.
Ap will calls GetProcessorIndex function which final let AP calls
PeiService.
This patch avoid GetProcessorIndex call PeiService.
BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=1411
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ray Ni <ray.ni@intel.com>
In current implementation, core and package level sync uses same semaphores.
Sharing the semaphore may cause wrong execution order.
For example:
1. Feature A has CPU_FEATURE_CORE_BEFORE dependency with Feature B.
2. Feature C has CPU_FEATURE_PACKAGE_AFTER dependency with Feature B.
The expected feature initialization order is A B C:
A ---- (Core Depends) ----> B ---- (Package Depends) ----> C
For a CPU has 1 package, 2 cores and 4 threads. The feature initialization
order may like below:
Thread#1 Thread#2 Thread#3 Thread#4
[A.Init] [A.Init] [A.Init]
Release(S1, S2) Release(S1, S2) Release(S3, S4)
Wait(S1) * 2 Wait(S2) * 2 <------------------------------- Core sync
[B.Init] [B.Init]
Release (S1,S2,S3,S4)
Wait (S1) * 4 <----------------------------------------------------- Package sync
Wait(S4 * 2) <- Core sync
[B.Init]
In above case, for thread#4, when it syncs in core level, Wait(S4) * 2 isn't
blocked and [B.Init] runs. But [A.Init] hasn't run in thread#3. It's wrong!
Thread#4 should execute [B.Init] after thread#3 executes [A.Init] because B
core level depends on A.
The reason of the wrong execution order is that S4 is released in thread#1
by calling Release (S1, S2, S3, S4) and in thread #4 by calling
Release (S3, S4).
To fix this issue, core level sync and package level sync should use separate
semaphores.
In above example, the S4 released in Release (S1, S2, S3, S4) should not be the
same semaphore as that in Release (S3, S4).
Related BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=1311
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ruiyu Ni <ruiyu.ni@intel.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
V2 changes:
V1 change has regression which caused by change feature order.
V2 changes logic to detect dependence not only for the
neighborhood features. It need to check all features in the list.
V1 Changes:
In current code logic, only adjust feature position if current
CPU feature position not follow the request order. Just like
Feature A need to be executed before feature B, but current
feature A registers after feature B. So code will adjust the
position for feature A, move it to just before feature B. If
the position already met the requirement, code will not adjust
the position.
This logic has issue when met all below cases:
1. feature A has core or package level dependence with feature B.
2. feature A is register before feature B.
3. Also exist other features exist between feature A and B.
Root cause is driver ignores the dependence for this case, so
threads may execute not follow the dependence order.
Fix this issue by change code logic to adjust feature position
for CPU features which has dependence relationship.
Related BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=1311
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ruiyu Ni <Ruiyu.ni@intel.com>
Build UefiCpuPkg with below configuration:
Architecture(s) = IA32
Build target = NOOPT
Toolchain = VS2015x86
Below error info shows up:
DxeRegisterCpuFeaturesLib.lib(CpuFeaturesInitialize.obj) :
error LNK2001: unresolved external symbol __allmul
Valid mDependTypeStr type only have 5 items, use UINT32 type cast
to fix this error.
Cc: Dandan Bi <dandan.bi@intel.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ruiyu Ni <ruiyu.ni@intel.com>
Current code assume only one dependence (before or after) for one
feature. Enhance code logic to support feature has two dependence
(before and after) type.
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ruiyu Ni <ruiyu.ni@intel.com>
Code initialized in function can't be correctly detected by build tool.
Add code to clearly initialize the local variable before use it.
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Dandan Bi <dandan.bi@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ruiyu Ni <ruiyu.ni@intel.com>
V4 changes include:
1. Serial debug message for different threads when program the register table.
V3 changes include:
1. Use global variable instead of internal function to return string for register type
and dependence type.
2. Add comments for some complicated logic.
V2 changes include:
1. Add more description for the code part which need easy to understand.
2. Refine some code base on feedback for V1 changes.
V1 changes include:
In a system which has multiple cores, current set register value task costs huge times.
After investigation, current set MSR task costs most of the times. Current logic uses
SpinLock to let set MSR task as an single thread task for all cores. Because MSR has
scope attribute which may cause GP fault if multiple APs set MSR at the same time,
current logic use an easiest solution (use SpinLock) to avoid this issue, but it will
cost huge times.
In order to fix this performance issue, new solution will set MSRs base on their scope
attribute. After this, the SpinLock will not needed. Without SpinLock, new issue raised
which is caused by MSR dependence. For example, MSR A depends on MSR B which means MSR A
must been set after MSR B has been set. Also MSR B is package scope level and MSR A is
thread scope level. If system has multiple threads, Thread 1 needs to set the thread level
MSRs and thread 2 needs to set thread and package level MSRs. Set MSRs task for thread 1
and thread 2 like below:
Thread 1 Thread 2
MSR B N Y
MSR A Y Y
If driver don't control execute MSR order, for thread 1, it will execute MSR A first, but
at this time, MSR B not been executed yet by thread 2. system may trig exception at this
time.
In order to fix the above issue, driver introduces semaphore logic to control the MSR
execute sequence. For the above case, a semaphore will be add between MSR A and B for
all threads. Semaphore has scope info for it. The possible scope value is core or package.
For each thread, when it meets a semaphore during it set registers, it will 1) release
semaphore (+1) for each threads in this core or package(based on the scope info for this
semaphore) 2) acquire semaphore (-1) for all the threads in this core or package(based
on the scope info for this semaphore). With these two steps, driver can control MSR
sequence. Sample code logic like below:
//
// First increase semaphore count by 1 for processors in this package.
//
for (ProcessorIndex = 0; ProcessorIndex < PackageThreadsCount ; ProcessorIndex ++) {
LibReleaseSemaphore ((UINT32 *) &SemaphorePtr[PackageOffset + ProcessorIndex]);
}
//
// Second, check whether the count has reach the check number.
//
for (ProcessorIndex = 0; ProcessorIndex < ValidApCount; ProcessorIndex ++) {
LibWaitForSemaphore (&SemaphorePtr[ApOffset]);
}
Platform Requirement:
1. This change requires register MSR setting base on MSR scope info. If still register MSR
for all threads, exception may raised.
Known limitation:
1. Current CpuFeatures driver supports DXE instance and PEI instance. But semaphore logic
requires Aps execute in async mode which is not supported by PEI driver. So CpuFeature
PEI instance not works after this change. We plan to support async mode for PEI in phase
2 for this task.
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ruiyu Ni <ruiyu.ni@intel.com>
1. Do not use tab characters
2. No trailing white space in one line
3. All files must end with CRLF
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Liming Gao <liming.gao@intel.com>
When CpuCommonFeaturesLib use RegisterCpuFeaturesLib to register
CPU features, the CpuFeaturesData->BitMaskSize has already been
initialized. So delete redundant PcdGetSize PcdCpuFeaturesSupport
in CpuInitDataInitialize.
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Bell Song <binx.song@intel.com>
Reviewed-by: Eric Dong <eric.dong@intel.com>
Current code allocate buffer for the pointer which later get value
from PCD database. but current code error use "=" for this case.
Use AllocateCopyPool instead to fix it.
V2 enhanced to directly use AllocateCopyPool to get the PCD value.
V3 enhanced to avoid using local temp variable.
V4 enhanced to keep the functions to get the pcd values.
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Cc: Shao Ming <ming.shao@intel.com>
Cc: Kinney Michael D <michael.d.kinney@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Kinney Michael D <michael.d.kinney@intel.com>
Current debug message when enable/disable CPU feature not
correct. This patch enhances it to make it more accurate.
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Cc: Shao, Ming <ming.shao@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Ruiyu Ni <ruiyu.ni@intel.com>
PcdGetSize() returns UINTN data type. The consumer code should use UINTN data
to get its size.
This issue is found when PcdCpuFeaturesSupport is configured as patchable.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Liming Gao <liming.gao@intel.com>
Reviewed-by: Jeff Fan <jeff.fan@intel.com>
Disable CPU feature may return error, add error handling
code to handle it instead of assert it.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Jeff Fan <jeff.fan@intel.com>
Add error handling code when initialize the CPU feature failed.
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Eric Dong <eric.dong@intel.com>
Reviewed-by: Jeff Fan <jeff.fan@intel.com>
The current CPU_REGISTER_TABLE_ENTRY structure only defined UINT32 Index to
indicate MSR/MMIO address. It's ok for MSR because MSR address is UINT32 type
actually. But for MMIO address, UINT32 limits MMIO address exceeds 4GB.
This update on CPU_REGISTER_TABLE_ENTRY is to add additional UINT32 field
HighIndex to indicate the high 32bit MMIO address and original Index still
indicate the low 32bit MMIO address.
This update makes use of original padding space between ValidBitLength and
Value to add HighIndex.
Cc: Feng Tian <feng.tian@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>
PEI Register CPU Features Library instance is used to register/manager/program
CPU features on PEI phase.
DXE Register CPU Features Library instance is used to register/manager/program
CPU features on DXE phase.
v2:
Format debug messages.
v3:
Trim white space at end of line.
v4:
Remove unused local variable.
Cc: Feng Tian <feng.tian@intel.com>
Cc: Michael Kinney <michael.d.kinney@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Jeff Fan <jeff.fan@intel.com>
Reviewed-by: Feng Tian <feng.tian@intel.com>