Parallelly run the function to SeparateExceptionStacks for all CPUs and allocate buffers together for better performance. Cc: Eric Dong <eric.dong@intel.com> Reviewed-by: Ray Ni <ray.ni@intel.com> Cc: Rahul Kumar <rahul1.kumar@intel.com> Signed-off-by: Zhiguang Liu <zhiguang.liu@intel.com>