图形学书籍 Real-Time Rendering 第三章 The Graphics Processing Unit(GPU) (根据谷歌翻译修改)
图形学书籍 Real-Time Rendering 第三章 The Graphics Processing Unit(GPU) (根据谷歌翻译修改)
Chapter 3 The Graphics Processing Unit
Historically, graphics acceleration started with interpolating colors on each pixel scanline overlapping a triangle and then displaying these values. Including the ability to access image data allowed textures to be applied to surfaces. Adding hardware for interpolating and testing z-depths provided built-in visibility checking. Because of their frequent use, such processes were committed to dedicated hardware to increase performance. More parts of the rendering pipeline, and much more functionality for each, were added in successive generations. Dedicated graphics hardware’s only computational advantage over the CPU is speed, but speed is critical.
从历史上看,图形加速开始于在与三角形重叠的每个像素扫描线上插值颜色,然后显示这些值。 包括访问图像数据的能力允许将纹理应用于表面。 添加用于插值和测试 z 深度的硬件提供了内置的可见性检查。 由于它们的频繁使用,此类进程致力于专用硬件以提高性能。 渲染管道的更多部分以及每个部分的更多功能在连续几代中被添加。 专用图形硬件相对于 CPU 的唯一计算优势是速度,但速度至关重要。
Over the past two decades, graphics hardware has undergone an incredible transformation. The first consumer graphics chip to include hardware vertex processing (NVIDIA’s GeForce256) shipped in 1999. NVIDIA coined the term graphics processing unit (GPU) to differentiate the GeForce 256 from the previously available rasterizationonly chips, and it stuck. During the next few years, the GPU evolved from configurable implementations of a complex fixed-function pipeline to highly programmable blank slates where developers could implement their own algorithms. Programmable shaders of various kinds are the primary means by which the GPU is controlled. For efficiency, some parts of the pipeline remain configurable, not programmable, but the trend is toward programmability and flexibility [175].
在过去的二十年中,图形硬件经历了令人难以置信的转变。 第一个包含硬件顶点处理的消费类图形芯片(NVIDIA 的 GeForce256)于 1999 年面世。NVIDIA 创造了图形处理单元 (GPU) 一词以将 GeForce 256 与以前可用的仅光栅化芯片区分开来,并且它一直流行。 在接下来的几年里,GPU 从复杂的固定功能管道的可配置实现演变为高度可编程的白板,开发人员可以在其中实现自己的算法。 各种可编程着色器是控制 GPU 的主要方式。 为了提高效率,管道的某些部分仍然是可配置的,而不是可编程的,但趋势是朝着可编程性和灵活性发展 [175]。
GPUs gain their great speed from a focus on a narrow set of highly parallelizable tasks. They have custom silicon dedicated to implementing the z-buffer, to rapidly accessing texture images and other buffers, and to finding which pixels are covered by a triangle, for example. How these elements perform their functions is covered in Chapter 23. More important to know early on is how the GPU achieves parallelism for its programmable shaders.
GPU 通过专注于一小部分高度可并行化的任务来获得极快的速度。 例如,他们拥有专门用于实现 z 缓冲区、快速访问纹理图像和其他缓冲区以及查找哪些像素被三角形覆盖的定制芯片。 第 23 章介绍了这些元素如何执行其功能。更重要的是尽早了解 GPU 如何为其可编程着色器实现并行性。
Section 3.3 explains how shaders function. For now, what you need to know is that a shader core is a small processor that does some relatively isolated task, such as transforming a vertex from its location in the world to a screen coordinate, or computing the color of a pixel covered by a triangle. With thousands or millions of triangles being sent to the screen each frame, every second there can be billions of shader invocations, that is, separate instances where shader programs are run.
3.3 节解释了着色器是如何工作的。 现在,您需要知道的是着色器核心是一个小型处理器,它执行一些相对独立的任务,例如将顶点从其在世界中的位置转换为屏幕坐标,或者计算被三角形覆盖的像素的颜色 。 每帧都有数千或数百万个三角形被发送到屏幕,每秒可能有数十亿次着色器调用,即运行着色器程序的单独实例。
To begin with, latency is a concern that all processors face. Accessing data takes some amount of time. A basic way to think about latency is that the farther away the information is from the processor, the longer the wait. Section 23.3 covers latency in more detail. Information stored in memory chips will take longer to access than that in local registers. Section 18.4.1 discusses memory access in more depth. The key point is that waiting for data to be retrieved means the processor is stalled, which reduces performance.
首先,延迟是所有处理器都面临的一个问题。 访问数据需要一些时间。 考虑延迟的一个基本方法是信息离处理器越远,等待的时间就越长。 第 23.3 节更详细地介绍了延迟。 存储在存储芯片中的信息比本地寄存器中的信息需要更长的时间来访问。 18.4.1 节更深入地讨论了内存访问。 关键是等待数据被检索意味着处理器停滞,这会降低性能。
昇腾计算产业是基于昇腾系列(HUAWEI Ascend)处理器和基础软件构建的全栈 AI计算基础设施、行业应用及服务,https://devpress.csdn.net/organization/setting/general/146749包括昇腾系列处理器、系列硬件、CANN、AI计算框架、应用使能、开发工具链、管理运维工具、行业应用及服务等全产业链
更多推荐

所有评论(0)