Intel® Extension for PyTorch* is a Python package to extend official PyTorch. It makes the out-of-box user experience of PyTorch on CPU better while achieving good performance. It is developed and optimized for Intel Architecture Processors, Intel Processor Graphics, and Xe architecture-based Graphics.

To fully utilize the power of Intel® architecture and thus yield high performance, PyTorch, as well as Intel® Extension for PyTorch*, is powered by oneAPI Deep Neural Network Library (oneDNN), an open-source, cross-platform performance library of basic building blocks for deep learning applications.

Although the default primitives of PyTorch and Intel® Extension for PyTorch* are highly optimized, there are things users can do to improve performance. Most optimized configurations can be set automatically by the launcher script. This article introduces common methods recommended by Intel developers.

This section briefly introduces the structure of Intel CPUs, as well as the concept of Non-Uniform Memory Access (NUMA). We'll use the Intel® Xeon® processor Scalable family as an example to discuss an Intel CPU and how it works. Understanding this background knowledge is helpful for understanding the PyTorch optimization methodologies that Intel engineers recommend.

On the Intel® Xeon® Scalable Processors with Intel® C620 Series Chipsets (formerly Purley) platform, each chip provides up to 28 cores. Each core has a non-inclusive last-level cache and a 1MB L2 cache. The CPU features fast 2666 MHz DDR4 memory, six memory channels per CPU, the Intel Ultra Path Interconnect (UPI) high-speed point-to-point processor interconnect, and more. Figure 1 shows the microarchitecture of the Intel® Xeon® processor Scalable family chips. Each CPU chip consists of a number of cores, along with core-specific caches. Six channels of DDR4 memory are connected to the chip directly.

Figure 2: Typical two-socket configuration.

Figure 3: An ASUS Z11PA-D8 Intel® Xeon® server motherboard. It contains two sockets for Intel® Xeon® processor Scalable family CPUs.

In such a two-socket configuration, the chips communicate through the Intel UPI interconnect, which features a transfer speed of up to 10.4 GT/s. It is a good thing that more and more CPU cores are provided to users in one socket, because this brings more computation resources. However, this also brings memory access competition: a core that touches memory attached to the other socket pays a remote-access penalty, which is the essence of NUMA.
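On Linux, the socket and NUMA layout discussed above can be inspected without external tools by reading sysfs. A minimal sketch, assuming a standard Linux `/sys` mount (availability can vary in containers and other restricted environments):

```python
from pathlib import Path

# Each NUMA node (typically one per socket) appears as a directory
# named nodeN under /sys/devices/system/node on Linux.
node_root = Path("/sys/devices/system/node")
for node in sorted(node_root.glob("node[0-9]*")):
    # "cpulist" holds the logical CPU IDs attached to this node, e.g. "0-27".
    cpulist = (node / "cpulist").read_text().strip()
    print(f"{node.name}: CPUs {cpulist}")
```

On a two-socket system this would typically print two nodes, each with its own set of cores; `lscpu` and `numactl --hardware` report the same information in more detail.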
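Because each socket has its own directly attached memory channels, one common way to avoid cross-socket memory traffic is to pin a process to the cores of a single socket so its memory stays node-local. A minimal stdlib sketch on Linux — the half-and-half split between sockets below is an illustrative assumption, not a real mapping; verify the actual CPU-to-socket layout with `lscpu -e`:

```python
import os

# Logical CPUs this process is currently allowed to run on (Linux-only API).
cpus = sorted(os.sched_getaffinity(0))

# Illustrative assumption: the first half of the CPU IDs belong to socket 0.
# Real ID-to-socket mappings vary across systems.
socket0 = set(cpus[: max(1, len(cpus) // 2)])

# Restrict the current process to those cores to keep memory accesses local.
os.sched_setaffinity(0, socket0)
print(sorted(os.sched_getaffinity(0)))
```

Tools such as `numactl --cpunodebind=0 --membind=0` achieve the same binding at launch time, and the launcher script mentioned above can automate settings like this.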