Introduction

Ray Tracing in One Weekend, written by Peter Shirley (author of the well-known graphics "tiger book", Fundamentals of Computer Graphics), is the first volume of his three-part series on software ray tracing. It is an excellent introductory book: short, at only 54 pages, and well suited to beginners.

Overview

I’ve taught many graphics classes over the years. Often I do them in ray tracing, because you are forced to write all the code, but you can still get cool images with no API. I decided to adapt my course notes into a how-to, to get you to a cool program as quickly as possible. It will not be a full-featured ray tracer, but it does have the indirect lighting which has made ray tracing a staple in movies. Follow these steps, and the architecture of the ray tracer you produce will be good for extending to a more extensive ray tracer if you get excited and want to pursue that.

When somebody says “ray tracing” it could mean many things. What I am going to describe is technically a path tracer, and a fairly general one. While the code will be pretty simple (let the computer do the work!) I think you’ll be very happy with the images you can make.

I’ll take you through writing a ray tracer in the order I do it, along with some debugging tips. By the end, you will have a ray tracer that produces some great images. You should be able to do this in a weekend. If you take longer, don’t worry about it. I use C++ as the driving language, but you don’t need to. However, I suggest you do, because it’s fast, portable, and most production movie and video game renderers are written in C++. Note that I avoid most “modern features” of C++, but inheritance and operator overloading are too useful for ray tracers to pass on. I do not provide the code online, but the code is real and I show all of it except for a few straightforward operators in the vec3 class. I am a big believer in typing in code to learn it, but when code is available I use it, so I only practice what I preach when the code is not available. So don’t ask!

I have left that last part in because it is funny what a 180 I have done. Several readers ended up with subtle errors that were helped when we compared code. So please do type in the code, but if you want to look at mine it is at:

https://github.com/RayTracing/raytracing.github.io/


I assume a little bit of familiarity with vectors (like dot product and vector addition). If you don’t know that, do a little review. If you need that review, or to learn it for the first time, check out Marschner’s and my graphics text, Foley, Van Dam, et al., or McGuire’s graphics codex.

If you run into trouble, or do something cool you’d like to show somebody, send me some email at ptrshrl@gmail.com.

I’ll be maintaining a site related to the book including further reading and links to resources at a blog https://in1weekend.blogspot.com/ related to this book.

Thanks to everyone who lent a hand on this project. You can find them in the acknowledgments section at the end of this book.

Let’s get on with it!

Output an Image

The PPM Image Format

Whenever you start a renderer, you need a way to see an image. The most straightforward way is to write it to a file. The catch is, there are so many formats. Many of those are complex. I always start with a plain text ppm file. Here’s a nice description from Wikipedia:

The two numbers below the P3 line are the number of columns and rows — in other words, the lengths of the x and y axes, or the width and height of the image. The next line holds the maximum color value. After that come width × height (r, g, b) triplets, which are read in order, filling in pixel values scanline by scanline starting from the upper-left corner of the image.

Let’s make some C++ code to output such a thing:
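The original listing isn't reproduced in this post; a sketch consistent with the book's description might look like this (the 256×256 dimensions are arbitrary for a first test):

```cpp
#include <iostream>

int main() {
    // Image dimensions.
    const int image_width = 256;
    const int image_height = 256;

    // PPM header: format tag, dimensions, maximum color value.
    std::cout << "P3\n" << image_width << ' ' << image_height << "\n255\n";

    for (int j = image_height - 1; j >= 0; --j) {
        for (int i = 0; i < image_width; ++i) {
            auto r = double(i) / (image_width - 1);
            auto g = double(j) / (image_height - 1);
            auto b = 0.25;

            // Scale the [0,1] component values to integers in [0,255].
            int ir = static_cast<int>(255.999 * r);
            int ig = static_cast<int>(255.999 * g);
            int ib = static_cast<int>(255.999 * b);

            std::cout << ir << ' ' << ig << ' ' << ib << '\n';
        }
    }
}
```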

There are some things to note in that code:

1. The pixels are written out in rows with pixels left to right.
2. The rows are written out from top to bottom.
3. By convention, each of the red/green/blue components range from 0.0 to 1.0. We will relax that later when we internally use high dynamic range, but before output we will tone map to the zero to one range, so this code won’t change.
4. Red goes from fully off (black) to fully on (bright red) from left to right, and green goes from black at the bottom to fully on at the top. Red and green together make yellow so we should expect the upper right corner to be yellow.


• The auto keyword enables automatic type deduction: when declaring a variable, the compiler picks a matching type based on the type of the initializer. A related keyword is decltype.

• static_cast is the closest equivalent of a traditional C-style cast. It converts expression to new_type, and can be used to force implicit conversions (for example, converting a non-const object to a const one). It is checked at compile time and is meant for non-polymorphic conversions; it can convert pointers among other things, but there is no runtime type check to guarantee the conversion is safe. Its main uses are:

1. Converting pointers or references between a base class and a derived class within a class hierarchy

Upcasts (converting a derived-class pointer or reference to a base-class one) are safe

Downcasts (converting a base-class pointer or reference to a derived-class one) are unsafe, since there is no dynamic type check

2. Conversions between fundamental types, such as int to char or int to enum. The safety of these conversions is also the developer's responsibility

3. Converting a null pointer to a null pointer of the target type

4. Converting an expression of any type to void

Note: static_cast cannot cast away the const, volatile, or __unaligned attributes of expression

• The loop order here can be confusing at first: the outer loop runs over image_height (the rows), so it can look like the image is written bottom-up, while the text says "The rows are written out from top to bottom." In fact the two agree: j starts at image_height − 1 and counts down, PPM stores the top scanline first, and the first row written (j = image_height − 1, where green is at full strength) is the top of the image. Counting j down therefore writes the rows from top to bottom.

Creating an Image File

Because the file is written to the program output, you’ll need to redirect it to an image file. Typically this is done from the command-line by using the > redirection operator, like so:
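(Assuming the built executable is named inOneWeekend; adjust to your actual binary name.)

```
inOneWeekend.exe > image.ppm
```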

This is how things would look on Windows. On Mac or Linux, it would look like this:
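```
./inOneWeekend > image.ppm
```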

Opening the output file (in ToyViewer on my Mac, but try it in your favorite viewer and Google “ppm viewer” if your viewer doesn’t support it) shows this result:

Hooray! This is the graphics “hello world”. If your image doesn’t look like that, open the output file in a text editor and see what it looks like. It should start something like this:
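With the 256×256 program sketched above, the file should begin like this (the first row written is the top scanline, where green is at its maximum):

```
P3
256 256
255
0 255 63
1 255 63
...
```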

If it doesn’t, then you probably just have some newlines or something similar that is confusing the image reader.

If you want to produce more image types than PPM, I am a fan of stb_image.h, a header-only image library available on GitHub at https://github.com/nothings/stb.

Adding a Progress Indicator

Before we continue, let’s add a progress indicator to our output. This is a handy way to track the progress of a long render, and also to possibly identify a run that’s stalled out due to an infinite loop or other problem.

Our program outputs the image to the standard output stream (std::cout), so leave that alone and instead write to the error output stream (std::cerr):
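A sketch of the modified render loop (the carriage return `\r` rewrites the same line so the count updates in place, and std::flush pushes the text out immediately):

```cpp
for (int j = image_height - 1; j >= 0; --j) {
    std::cerr << "\rScanlines remaining: " << j << ' ' << std::flush;
    for (int i = 0; i < image_width; ++i) {
        // ... write the pixel color to std::cout as before ...
    }
}
std::cerr << "\nDone.\n";
```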

The vec3 Class

Almost all graphics programs have some class(es) for storing geometric vectors and colors. In many systems these vectors are 4D (3D plus a homogeneous coordinate for geometry, and RGB plus an alpha transparency channel for colors). For our purposes, three coordinates suffices. We’ll use the same class vec3 for colors, locations, directions, offsets, whatever. Some people don’t like this because it doesn’t prevent you from doing something silly, like adding a color to a location. They have a good point, but we’re going to always take the “less code” route when not obviously wrong. In spite of this, we do declare two aliases for vec3: point3 and color. Since these two types are just aliases for vec3, you won’t get warnings if you pass a color to a function expecting a point3, for example. We use them only to clarify intent and use.

Variables and Methods

Here’s the top part of my vec3 class:
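The listing isn't reproduced in this post; a sketch consistent with the book's design (three doubles, a few operators, and the point3/color aliases mentioned above):

```cpp
#ifndef VEC3_H
#define VEC3_H

#include <cmath>
#include <iostream>

class vec3 {
    public:
        vec3() : e{0,0,0} {}
        vec3(double e0, double e1, double e2) : e{e0, e1, e2} {}

        double x() const { return e[0]; }
        double y() const { return e[1]; }
        double z() const { return e[2]; }

        vec3 operator-() const { return vec3(-e[0], -e[1], -e[2]); }
        double operator[](int i) const { return e[i]; }
        double& operator[](int i) { return e[i]; }

        vec3& operator+=(const vec3 &v) {
            e[0] += v.e[0];
            e[1] += v.e[1];
            e[2] += v.e[2];
            return *this;
        }

        vec3& operator*=(const double t) {
            e[0] *= t;
            e[1] *= t;
            e[2] *= t;
            return *this;
        }

        vec3& operator/=(const double t) {
            return *this *= 1/t;
        }

        double length() const {
            return std::sqrt(length_squared());
        }

        double length_squared() const {
            return e[0]*e[0] + e[1]*e[1] + e[2]*e[2];
        }

    public:
        double e[3];
};

// Type aliases for vec3 — used only to clarify intent and use.
using point3 = vec3;   // 3D point
using color = vec3;    // RGB color

#endif
```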

We use double here, but some ray tracers use float. Either one is fine — follow your own tastes.

vec3 Utility Functions

The second part of the header file contains vector utility functions:
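Again reconstructed as a sketch; these are free functions in the same header:

```cpp
inline std::ostream& operator<<(std::ostream &out, const vec3 &v) {
    return out << v.e[0] << ' ' << v.e[1] << ' ' << v.e[2];
}

inline vec3 operator+(const vec3 &u, const vec3 &v) {
    return vec3(u.e[0] + v.e[0], u.e[1] + v.e[1], u.e[2] + v.e[2]);
}

inline vec3 operator-(const vec3 &u, const vec3 &v) {
    return vec3(u.e[0] - v.e[0], u.e[1] - v.e[1], u.e[2] - v.e[2]);
}

inline vec3 operator*(const vec3 &u, const vec3 &v) {
    return vec3(u.e[0] * v.e[0], u.e[1] * v.e[1], u.e[2] * v.e[2]);
}

inline vec3 operator*(double t, const vec3 &v) {
    return vec3(t*v.e[0], t*v.e[1], t*v.e[2]);
}

inline vec3 operator*(const vec3 &v, double t) {
    return t * v;
}

inline vec3 operator/(vec3 v, double t) {
    return (1/t) * v;
}

inline double dot(const vec3 &u, const vec3 &v) {
    return u.e[0] * v.e[0]
         + u.e[1] * v.e[1]
         + u.e[2] * v.e[2];
}

inline vec3 cross(const vec3 &u, const vec3 &v) {
    return vec3(u.e[1] * v.e[2] - u.e[2] * v.e[1],
                u.e[2] * v.e[0] - u.e[0] * v.e[2],
                u.e[0] * v.e[1] - u.e[1] * v.e[0]);
}

inline vec3 unit_vector(vec3 v) {
    return v / v.length();
}
```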

Rays, a Simple Camera, and Background

The ray Class

The one thing that all ray tracers have is a ray class and a computation of what color is seen along a ray. Let’s think of a ray as a function $\mathbf{P}(t) = \mathbf{A} + t\mathbf{b}$. Here $\mathbf{P}$ is a 3D position along a line in 3D. $\mathbf{A}$ is the ray origin and $\mathbf{b}$ is the ray direction. The ray parameter $t$ is a real number (double in the code). Plug in a different $t$ and $\mathbf{P}(t)$ moves the point along the ray. Add in negative $t$ values and you can go anywhere on the 3D line. For positive $t$, you get only the parts in front of $\mathbf{A}$, and this is what is often called a half-line or ray.

The function $\mathbf{P}(t)$ in more verbose code form I call ray::at(t):
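A sketch of that class:

```cpp
#ifndef RAY_H
#define RAY_H

#include "vec3.h"

class ray {
    public:
        ray() {}
        ray(const point3& origin, const vec3& direction)
            : orig(origin), dir(direction)
        {}

        point3 origin() const  { return orig; }
        vec3 direction() const { return dir; }

        // P(t) = A + t*b
        point3 at(double t) const {
            return orig + t*dir;
        }

    public:
        point3 orig;
        vec3 dir;
};

#endif
```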

Sending Rays Into the Scene

Now we are ready to turn the corner and make a ray tracer. At the core, the ray tracer sends rays through pixels and computes the color seen in the direction of those rays. The involved steps are (1) calculate the ray from the eye to the pixel, (2) determine which objects the ray intersects, and (3) compute a color for that intersection point. When first developing a ray tracer, I always do a simple camera for getting the code up and running. I also make a simple ray_color(ray) function that returns the color of the background (a simple gradient).

I’ve often gotten into trouble using square images for debugging because I transpose $x$ and $y$ too often, so I’ll use a non-square image. For now we’ll use a 16:9 aspect ratio, since that’s so common.

In addition to setting up the pixel dimensions for the rendered image, we also need to set up a virtual viewport through which to pass our scene rays. For the standard square pixel spacing, the viewport’s aspect ratio should be the same as our rendered image. We’ll just pick a viewport two units in height. We’ll also set the distance between the projection plane and the projection point to be one unit. This is referred to as the “focal length”, not to be confused with “focus distance”, which we’ll present later.

I’ll put the “eye” (or camera center if you think of a camera) at $(0,0,0)$. I will have the y-axis go up, and the x-axis to the right. In order to respect the convention of a right handed coordinate system, into the screen is the negative z-axis. I will traverse the screen from the upper left hand corner, and use two offset vectors along the screen sides to move the ray endpoint across the screen. Note that I do not make the ray direction a unit length vector because I think not doing that makes for simpler and slightly faster code.

Below in code, the ray r goes to approximately the pixel centers (I won’t worry about exactness for now because we’ll add antialiasing later):
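A sketch of main plus the gradient ray_color described just below (the inline write_color helper is an assumption for self-containedness; the book keeps it in a separate color.h):

```cpp
#include "ray.h"
#include "vec3.h"

#include <iostream>

color ray_color(const ray& r) {
    vec3 unit_direction = unit_vector(r.direction());
    // Map y from [-1,1] to t in [0,1], then lerp white -> blue.
    auto t = 0.5*(unit_direction.y() + 1.0);
    return (1.0-t)*color(1.0, 1.0, 1.0) + t*color(0.5, 0.7, 1.0);
}

void write_color(std::ostream &out, color pixel_color) {
    out << static_cast<int>(255.999 * pixel_color.x()) << ' '
        << static_cast<int>(255.999 * pixel_color.y()) << ' '
        << static_cast<int>(255.999 * pixel_color.z()) << '\n';
}

int main() {
    // Image
    const auto aspect_ratio = 16.0 / 9.0;
    const int image_width = 400;
    const int image_height = static_cast<int>(image_width / aspect_ratio);

    // Camera: a 2-unit-high viewport one focal length in front of the origin.
    auto viewport_height = 2.0;
    auto viewport_width = aspect_ratio * viewport_height;
    auto focal_length = 1.0;

    auto origin = point3(0, 0, 0);
    auto horizontal = vec3(viewport_width, 0, 0);
    auto vertical = vec3(0, viewport_height, 0);
    auto lower_left_corner = origin - horizontal/2 - vertical/2 - vec3(0, 0, focal_length);

    // Render
    std::cout << "P3\n" << image_width << ' ' << image_height << "\n255\n";

    for (int j = image_height - 1; j >= 0; --j) {
        std::cerr << "\rScanlines remaining: " << j << ' ' << std::flush;
        for (int i = 0; i < image_width; ++i) {
            auto u = double(i) / (image_width - 1);
            auto v = double(j) / (image_height - 1);
            ray r(origin, lower_left_corner + u*horizontal + v*vertical - origin);
            write_color(std::cout, ray_color(r));
        }
    }
    std::cerr << "\nDone.\n";
}
```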

The ray_color(ray) function linearly blends white and blue depending on the height of the 𝑦 coordinate after scaling the ray direction to unit length (so $−1.0<𝑦<1.0$). Because we’re looking at the 𝑦 height after normalizing the vector, you’ll notice a horizontal gradient to the color in addition to the vertical gradient.

I then did a standard graphics trick of scaling that to $0.0≤𝑡≤1.0$. When $𝑡=1.0$ I want blue. When $𝑡=0.0$ I want white. In between, I want a blend. This forms a “linear blend”, or “linear interpolation”, or “lerp” for short, between two things. A lerp is always of the form


$$\text{blendedValue} = (1-t) \cdot \text{startValue} + t \cdot \text{endValue}$$

with 𝑡 going from zero to one. In our case this produces:

Adding a Sphere

Let’s add a single object to our ray tracer. People often use spheres in ray tracers because calculating whether a ray hits a sphere is pretty straightforward.

Ray-Sphere Intersection

Recall that the equation for a sphere centered at the origin of radius $R$ is $x^2+y^2+z^2=R^2$. Put another way, if a given point $(x,y,z)$ is on the sphere, then $x^2+y^2+z^2=R^2$. If the given point $(x,y,z)$ is inside the sphere, then $x^2+y^2+z^2<R^2$, and if a given point $(x,y,z)$ is *outside* the sphere, then $x^2+y^2+z^2>R^2$.

It gets uglier if the sphere center is at $(C_x, C_y, C_z)$:

$$(x-C_x)^2 + (y-C_y)^2 + (z-C_z)^2 = R^2$$

In graphics, you almost always want your formulas in terms of vectors, so the x/y/z components stay under the hood in the vec3 class. Note that the vector from center $\mathbf{C}$ to a point $\mathbf{P} = (x,y,z)$ is $(\mathbf{P}-\mathbf{C})$, and:

$$(\mathbf{P}-\mathbf{C}) \cdot (\mathbf{P}-\mathbf{C}) = (x-C_x)^2 + (y-C_y)^2 + (z-C_z)^2$$

So the equation of the sphere in vector form is:

$$(\mathbf{P}-\mathbf{C}) \cdot (\mathbf{P}-\mathbf{C}) = R^2$$

We can read this as “any point $\mathbf{P}$ that satisfies this equation is on the sphere”. We want to know if our ray $\mathbf{P}(t) = \mathbf{A} + t\mathbf{b}$ ever hits the sphere anywhere. If it does hit the sphere, there is some $t$ for which $\mathbf{P}(t)$ satisfies the sphere equation. So we are looking for any $t$ where this is true:

$$(\mathbf{P}(t)-\mathbf{C}) \cdot (\mathbf{P}(t)-\mathbf{C}) = R^2$$

or expanding the full form of the ray $\mathbf{P}(t)$:

$$(\mathbf{A}+t\mathbf{b}-\mathbf{C}) \cdot (\mathbf{A}+t\mathbf{b}-\mathbf{C}) = R^2$$

The rules of vector algebra are all that we would want here. If we expand that equation and move all the terms to the left hand side we get:

$$(t\mathbf{b} + (\mathbf{A}-\mathbf{C})) \cdot (t\mathbf{b} + (\mathbf{A}-\mathbf{C})) - R^2 = 0$$

$$t^2\,\mathbf{b}\cdot\mathbf{b} + 2t\,\mathbf{b}\cdot(\mathbf{A}-\mathbf{C}) + (\mathbf{A}-\mathbf{C})\cdot(\mathbf{A}-\mathbf{C}) - R^2 = 0$$

The vectors and $R$ in that equation are all constant and known. The unknown is $t$, and the equation is a quadratic, like you probably saw in your high school math class. You can solve for $t$ and there is a square root part that is either positive (meaning two real solutions), negative (meaning no real solutions), or zero (meaning one real solution). In graphics, the algebra almost always relates very directly to the geometry. What we have is:

Creating Our First Raytraced Image

If we take that math and hard-code it into our program, we can test it by coloring red any pixel that hits a small sphere we place at −1 on the z-axis:
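A sketch of that hard-coded test, using the quadratic coefficients derived above:

```cpp
bool hit_sphere(const point3& center, double radius, const ray& r) {
    vec3 oc = r.origin() - center;
    auto a = dot(r.direction(), r.direction());
    auto b = 2.0 * dot(oc, r.direction());
    auto c = dot(oc, oc) - radius*radius;
    auto discriminant = b*b - 4*a*c;
    return (discriminant > 0);   // any real solution means the ray hits
}

color ray_color(const ray& r) {
    if (hit_sphere(point3(0,0,-1), 0.5, r))
        return color(1, 0, 0);   // red for any pixel that hits the sphere
    vec3 unit_direction = unit_vector(r.direction());
    auto t = 0.5*(unit_direction.y() + 1.0);
    return (1.0-t)*color(1.0, 1.0, 1.0) + t*color(0.5, 0.7, 1.0);
}
```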

Surface Normals and Multiple Objects

Shading with Surface Normals

First, let’s get ourselves a surface normal so we can shade. This is a vector that is perpendicular to the surface at the point of intersection. There are two design decisions to make for normals. The first is whether these normals are unit length. That is convenient for shading so I will say yes, but I won’t enforce that in the code. This could allow subtle bugs, so be aware this is personal preference as are most design decisions like that. For a sphere, the outward normal is in the direction of the hit point minus the center:
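In equation form, normalized to unit length:

$$\mathbf{n} = \frac{\mathbf{P} - \mathbf{C}}{|\mathbf{P} - \mathbf{C}|}$$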

On the earth, this implies that the vector from the earth’s center to you points straight up. Let’s throw that into the code now, and shade it. We don’t have any lights or anything yet, so let’s just visualize the normals with a color map. A common trick used for visualizing normals (because it’s easy and somewhat intuitive to assume $\mathbf{n}$ is a unit length vector — so each component is between −1 and 1) is to map each component to the interval from 0 to 1, and then map x/y/z to r/g/b. For the normal, we need the hit point, not just whether we hit or not. We only have one sphere in the scene, and it’s directly in front of the camera, so we won’t worry about negative values of $t$ yet. We’ll just assume the closest hit point (smallest $t$). These changes in the code let us compute and visualize $\mathbf{n}$:
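A sketch of those changes:

```cpp
double hit_sphere(const point3& center, double radius, const ray& r) {
    vec3 oc = r.origin() - center;
    auto a = dot(r.direction(), r.direction());
    auto b = 2.0 * dot(oc, r.direction());
    auto c = dot(oc, oc) - radius*radius;
    auto discriminant = b*b - 4*a*c;
    if (discriminant < 0) {
        return -1.0;                                   // no hit
    } else {
        return (-b - std::sqrt(discriminant)) / (2.0*a);  // nearest t
    }
}

color ray_color(const ray& r) {
    auto t = hit_sphere(point3(0,0,-1), 0.5, r);
    if (t > 0.0) {
        // Normal = hit point minus center, normalized; map [-1,1] to [0,1].
        vec3 N = unit_vector(r.at(t) - vec3(0,0,-1));
        return 0.5*color(N.x()+1, N.y()+1, N.z()+1);
    }
    vec3 unit_direction = unit_vector(r.direction());
    t = 0.5*(unit_direction.y() + 1.0);
    return (1.0-t)*color(1.0, 1.0, 1.0) + t*color(0.5, 0.7, 1.0);
}
```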

Simplifying the Ray-Sphere Intersection Code

Let’s revisit the ray-sphere equation:
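From the expansion earlier, the quadratic in $t$ has coefficients:

$$a = \mathbf{b}\cdot\mathbf{b}, \qquad b = 2\,\mathbf{b}\cdot(\mathbf{A}-\mathbf{C}), \qquad c = (\mathbf{A}-\mathbf{C})\cdot(\mathbf{A}-\mathbf{C}) - R^2$$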

First, recall that a vector dotted with itself is equal to the squared length of that vector.

Second, notice how the equation for b has a factor of two in it. Consider what happens to the quadratic equation if $b = 2h$:

$$\frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-2h \pm \sqrt{(2h)^2 - 4ac}}{2a} = \frac{-h \pm \sqrt{h^2 - ac}}{a}$$

Using these observations, we can now simplify the sphere-intersection code to this:
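A sketch of the simplified version, using length_squared() and the half-b substitution:

```cpp
double hit_sphere(const point3& center, double radius, const ray& r) {
    vec3 oc = r.origin() - center;
    auto a = r.direction().length_squared();       // b.b = |b|^2
    auto half_b = dot(oc, r.direction());           // h = b.(A-C)
    auto c = oc.length_squared() - radius*radius;
    auto discriminant = half_b*half_b - a*c;

    if (discriminant < 0) {
        return -1.0;
    } else {
        return (-half_b - std::sqrt(discriminant)) / a;
    }
}
```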

An Abstraction for Hittable Objects

Now, how about several spheres? While it is tempting to have an array of spheres, a very clean solution is to make an “abstract class” for anything a ray might hit, and make both a sphere and a list of spheres just something you can hit. What that class should be called is something of a quandary — calling it an “object” would be good if not for “object oriented” programming. “Surface” is often used, with the weakness being maybe we will want volumes. “hittable” emphasizes the member function that unites them. I don’t love any of these, but I will go with “hittable”.

This hittable abstract class will have a hit function that takes in a ray. Most ray tracers have found it convenient to add a valid interval for hits, $t_{min}$ to $t_{max}$, so the hit only “counts” if $t_{min} < t < t_{max}$. For the initial rays this is positive $t$, but as we will see, it can help some details in the code to have an interval $t_{min}$ to $t_{max}$. One design question is whether to do things like compute the normal if we hit something. We might end up hitting something closer as we do our search, and we will only need the normal of the closest thing. I will go with the simple solution and compute a bundle of stuff I will store in some structure. Here’s the abstract class:

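A sketch of that abstract class:

```cpp
#ifndef HITTABLE_H
#define HITTABLE_H

#include "ray.h"

struct hit_record {
    point3 p;      // hit point
    vec3 normal;   // surface normal at p
    double t;      // ray parameter of the hit
};

class hittable {
    public:
        // Returns true if the ray hits this object with t in (t_min, t_max),
        // filling rec with the details of the closest such hit.
        virtual bool hit(const ray& r, double t_min, double t_max, hit_record& rec) const = 0;
};

#endif
```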

And here’s the sphere:
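A sketch, reusing the half-b intersection math from before:

```cpp
#ifndef SPHERE_H
#define SPHERE_H

#include "hittable.h"
#include "vec3.h"

#include <cmath>

class sphere : public hittable {
    public:
        sphere() {}
        sphere(point3 cen, double r) : center(cen), radius(r) {}

        virtual bool hit(const ray& r, double t_min, double t_max, hit_record& rec) const override;

    public:
        point3 center;
        double radius;
};

bool sphere::hit(const ray& r, double t_min, double t_max, hit_record& rec) const {
    vec3 oc = r.origin() - center;
    auto a = r.direction().length_squared();
    auto half_b = dot(oc, r.direction());
    auto c = oc.length_squared() - radius*radius;

    auto discriminant = half_b*half_b - a*c;
    if (discriminant < 0) return false;
    auto sqrtd = std::sqrt(discriminant);

    // Find the nearest root in the acceptable range, trying the near root first.
    auto root = (-half_b - sqrtd) / a;
    if (root < t_min || t_max < root) {
        root = (-half_b + sqrtd) / a;
        if (root < t_min || t_max < root)
            return false;
    }

    rec.t = root;
    rec.p = r.at(rec.t);
    rec.normal = (rec.p - center) / radius;   // outward unit normal

    return true;
}

#endif
```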

Front Faces Versus Back Faces

The second design decision for normals is whether they should always point out. At present, the normal found will always be in the direction of the center to the intersection point (the normal points out). If the ray intersects the sphere from the outside, the normal points against the ray. If the ray intersects the sphere from the inside, the normal (which always points out) points with the ray. Alternatively, we can have the normal always point against the ray. If the ray is outside the sphere, the normal will point outward, but if the ray is inside the sphere, the normal will point inward.

We need to choose one of these possibilities because we will eventually want to determine which side of the surface that the ray is coming from. This is important for objects that are rendered differently on each side, like the text on a two-sided sheet of paper, or for objects that have an inside and an outside, like glass balls.

If we decide to have the normals always point out, then we will need to determine which side the ray is on when we color it. We can figure this out by comparing the ray with the normal. If the ray and the normal face in the same direction, the ray is inside the object, if the ray and the normal face in the opposite direction, then the ray is outside the object. This can be determined by taking the dot product of the two vectors, where if their dot is positive, the ray is inside the sphere.

If we decide to have the normals always point against the ray, we won’t be able to use the dot product to determine which side of the surface the ray is on. Instead, we would need to store that information:
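A sketch of that bookkeeping:

```cpp
bool front_face;
if (dot(ray_direction, outward_normal) > 0.0) {
    // ray is inside the sphere
    normal = -outward_normal;
    front_face = false;
} else {
    // ray is outside the sphere
    normal = outward_normal;
    front_face = true;
}
```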

We can set things up so that normals always point “outward” from the surface, or always point against the incident ray. This decision is determined by whether you want to determine the side of the surface at the time of geometry intersection or at the time of coloring. In this book we have more material types than we have geometry types, so we’ll go for less work and put the determination at geometry time. This is simply a matter of preference, and you’ll see both implementations in the literature.

We add the front_face bool to the hit_record struct. We’ll also add a function to solve this calculation for us.
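A sketch of the updated struct:

```cpp
struct hit_record {
    point3 p;
    vec3 normal;
    double t;
    bool front_face;   // true if the ray hit the outside of the surface

    inline void set_face_normal(const ray& r, const vec3& outward_normal) {
        front_face = dot(r.direction(), outward_normal) < 0;
        normal = front_face ? outward_normal : -outward_normal;
    }
};
```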

And then we add the surface side determination to the class:
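The tail of sphere::hit becomes (root-finding unchanged):

```cpp
    rec.t = root;
    rec.p = r.at(rec.t);
    vec3 outward_normal = (rec.p - center) / radius;
    rec.set_face_normal(r, outward_normal);

    return true;
```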

Some New C++ Features

The hittable_list class code uses two C++ features that may trip you up if you’re not normally a C++ programmer: vector and shared_ptr.
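For reference, since the listing isn't reproduced in this post, a sketch of that class:

```cpp
#ifndef HITTABLE_LIST_H
#define HITTABLE_LIST_H

#include "hittable.h"

#include <memory>
#include <vector>

using std::shared_ptr;
using std::make_shared;

class hittable_list : public hittable {
    public:
        hittable_list() {}
        hittable_list(shared_ptr<hittable> object) { add(object); }

        void clear() { objects.clear(); }
        void add(shared_ptr<hittable> object) { objects.push_back(object); }

        virtual bool hit(const ray& r, double t_min, double t_max, hit_record& rec) const override;

    public:
        std::vector<shared_ptr<hittable>> objects;
};

bool hittable_list::hit(const ray& r, double t_min, double t_max, hit_record& rec) const {
    hit_record temp_rec;
    bool hit_anything = false;
    auto closest_so_far = t_max;

    // Keep only the closest hit by shrinking the valid interval as we go.
    for (const auto& object : objects) {
        if (object->hit(r, t_min, closest_so_far, temp_rec)) {
            hit_anything = true;
            closest_so_far = temp_rec.t;
            rec = temp_rec;
        }
    }

    return hit_anything;
}

#endif
```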


shared_ptr

shared_ptr<type> is a pointer to some allocated type, with reference-counting semantics. Every time you assign its value to another shared pointer (usually with a simple assignment), the reference count is incremented. As shared pointers go out of scope (like at the end of a block or function), the reference count is decremented. Once the count goes to zero, the object is deleted.


Typically, a shared pointer is first initialized with a newly-allocated object, something like this:
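```cpp
shared_ptr<double> double_ptr = make_shared<double>(0.37);
shared_ptr<vec3>   vec3_ptr   = make_shared<vec3>(1.414214, 2.718281, 1.618034);
shared_ptr<sphere> sphere_ptr = make_shared<sphere>(point3(0,0,0), 1.0);
```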

make_shared<thing>(thing_constructor_params ...) allocates a new instance of type thing, using the constructor parameters. It returns a shared_ptr<thing>.

Since the type can be automatically deduced by the return type of make_shared<type>(...), the above lines can be more simply expressed using C++’s auto type specifier:
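```cpp
auto double_ptr = make_shared<double>(0.37);
auto vec3_ptr   = make_shared<vec3>(1.414214, 2.718281, 1.618034);
auto sphere_ptr = make_shared<sphere>(point3(0,0,0), 1.0);
```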


We’ll use shared pointers in our code, because it allows multiple geometries to share a common instance (for example, a bunch of spheres that all use the same texture map material), and because it makes memory management automatic and easier to reason about.

std::shared_ptr is included with the <memory> header.


vector

The second C++ feature you may be unfamiliar with is std::vector. This is a generic array-like collection of an arbitrary type. Above, we use a collection of pointers to hittable. std::vector automatically grows as more values are added: objects.push_back(object) adds a value to the end of the std::vector member variable objects.

std::vector is included with the <vector> header.


Finally, the using statements in the hittable_list header tell the compiler that we’ll be getting shared_ptr and make_shared from the std library, so we don’t need to prefix these with std:: every time we reference them.

Common Constants and Utility Functions

We need some math constants that we conveniently define in their own header file. For now we only need infinity, but we will also throw our own definition of pi in there, which we will need later. There is no standard portable definition of pi, so we just define our own constant for it. We’ll throw common useful constants and future utility functions in rtweekend.h, our general main header file.
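A sketch of that header (the exact constants and helpers beyond infinity and pi are assumptions based on what the text says we'll need):

```cpp
#ifndef RTWEEKEND_H
#define RTWEEKEND_H

#include <cmath>
#include <limits>
#include <memory>

// Usings

using std::shared_ptr;
using std::make_shared;
using std::sqrt;

// Constants

const double infinity = std::numeric_limits<double>::infinity();
const double pi = 3.1415926535897932385;

// Utility Functions

inline double degrees_to_radians(double degrees) {
    return degrees * pi / 180.0;
}

// Common Headers

#include "ray.h"
#include "vec3.h"

#endif
```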

And the new main:
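A sketch of it, with ray_color now asking the world for the nearest hit:

```cpp
#include "rtweekend.h"

#include "hittable_list.h"
#include "sphere.h"

#include <iostream>

color ray_color(const ray& r, const hittable& world) {
    hit_record rec;
    if (world.hit(r, 0, infinity, rec)) {
        return 0.5 * (rec.normal + color(1,1,1));   // visualize the normal
    }
    vec3 unit_direction = unit_vector(r.direction());
    auto t = 0.5*(unit_direction.y() + 1.0);
    return (1.0-t)*color(1.0, 1.0, 1.0) + t*color(0.5, 0.7, 1.0);
}

int main() {
    // Image
    const auto aspect_ratio = 16.0 / 9.0;
    const int image_width = 400;
    const int image_height = static_cast<int>(image_width / aspect_ratio);

    // World: a small sphere in front of the camera, plus a huge "ground" sphere.
    hittable_list world;
    world.add(make_shared<sphere>(point3(0,0,-1), 0.5));
    world.add(make_shared<sphere>(point3(0,-100.5,-1), 100));

    // Camera
    auto viewport_height = 2.0;
    auto viewport_width = aspect_ratio * viewport_height;
    auto focal_length = 1.0;

    auto origin = point3(0, 0, 0);
    auto horizontal = vec3(viewport_width, 0, 0);
    auto vertical = vec3(0, viewport_height, 0);
    auto lower_left_corner = origin - horizontal/2 - vertical/2 - vec3(0, 0, focal_length);

    // Render
    std::cout << "P3\n" << image_width << ' ' << image_height << "\n255\n";

    for (int j = image_height - 1; j >= 0; --j) {
        std::cerr << "\rScanlines remaining: " << j << ' ' << std::flush;
        for (int i = 0; i < image_width; ++i) {
            auto u = double(i) / (image_width - 1);
            auto v = double(j) / (image_height - 1);
            ray r(origin, lower_left_corner + u*horizontal + v*vertical - origin);
            write_color(std::cout, ray_color(r, world));
        }
    }
    std::cerr << "\nDone.\n";
}
```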

This yields a picture that is really just a visualization of where the spheres are along with their surface normal. This is often a great way to look at your model for flaws and characteristics.

Antialiasing

When a real camera takes a picture, there are usually no jaggies along edges because the edge pixels are a blend of some foreground and some background. We can get the same effect by averaging a bunch of samples inside each pixel. We will not bother with stratification. This is controversial, but is usual for my programs. For some ray tracers it is critical, but the kind of general one we are writing doesn’t benefit very much from it and it makes the code uglier. We abstract the camera class a bit so we can make a cooler camera later.

Some Random Number Utilities

We need a random number generator that returns real random numbers. By convention, this should be a canonical random number generator that returns a random real in the range $0 \le r < 1$. The “less than” before the 1 is important, as we will sometimes take advantage of that.

A simple approach to this is to use the rand() function that can be found in <cstdlib>. This function returns a random integer in the range 0 to RAND_MAX. Hence we can get a real random number as desired with the following code snippet, added to rtweekend.h:
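```cpp
#include <cstdlib>

// Returns a random real in [0,1).
inline double random_double() {
    return rand() / (RAND_MAX + 1.0);
}

// Returns a random real in [min,max).
inline double random_double(double min, double max) {
    return min + (max-min)*random_double();
}
```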

C++ did not traditionally have a standard random number generator, but newer versions of C++ have addressed this issue with the <random> header (if imperfectly according to some experts). If you want to use this, you can obtain a random number with the conditions we need as follows:
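```cpp
#include <random>

inline double random_double() {
    static std::uniform_real_distribution<double> distribution(0.0, 1.0);
    static std::mt19937 generator;
    return distribution(generator);
}
```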

Generating Pixels with Multiple Samples

For a given pixel we have several samples within that pixel and send rays through each of the samples. The colors of these rays are then averaged:

Now’s a good time to create a camera class to manage our virtual camera and the related tasks of scene sampling. The following class implements a simple camera using the axis-aligned camera from before:
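A sketch of that class, extracting the camera setup out of main:

```cpp
#ifndef CAMERA_H
#define CAMERA_H

#include "rtweekend.h"

class camera {
    public:
        camera() {
            auto aspect_ratio = 16.0 / 9.0;
            auto viewport_height = 2.0;
            auto viewport_width = aspect_ratio * viewport_height;
            auto focal_length = 1.0;

            origin = point3(0, 0, 0);
            horizontal = vec3(viewport_width, 0.0, 0.0);
            vertical = vec3(0.0, viewport_height, 0.0);
            lower_left_corner = origin - horizontal/2 - vertical/2 - vec3(0, 0, focal_length);
        }

        ray get_ray(double u, double v) const {
            return ray(origin, lower_left_corner + u*horizontal + v*vertical - origin);
        }

    private:
        point3 origin;
        point3 lower_left_corner;
        vec3 horizontal;
        vec3 vertical;
};

#endif
```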

To handle the multi-sampled color computation, we’ll update the write_color() function. Rather than adding in a fractional contribution each time we accumulate more light to the color, just add the full color each iteration, and then perform a single divide at the end (by the number of samples) when writing out the color. In addition, we’ll add a handy utility function to the rtweekend.h utility header: clamp(x,min,max), which clamps the value x to the range [min,max]:
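A sketch of the updated helpers (as the text says, clamp belongs in rtweekend.h; it is shown here for self-containedness):

```cpp
#include "vec3.h"

#include <iostream>

// Clamps x to the range [min,max]. (Lives in rtweekend.h.)
inline double clamp(double x, double min, double max) {
    if (x < min) return min;
    if (x > max) return max;
    return x;
}

void write_color(std::ostream &out, color pixel_color, int samples_per_pixel) {
    auto r = pixel_color.x();
    auto g = pixel_color.y();
    auto b = pixel_color.z();

    // Divide the accumulated color by the number of samples.
    auto scale = 1.0 / samples_per_pixel;
    r *= scale;
    g *= scale;
    b *= scale;

    // Write the translated [0,255] value of each color component.
    out << static_cast<int>(256 * clamp(r, 0.0, 0.999)) << ' '
        << static_cast<int>(256 * clamp(g, 0.0, 0.999)) << ' '
        << static_cast<int>(256 * clamp(b, 0.0, 0.999)) << '\n';
}
```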

The main function also changes; a sketch of the new render loop follows the list below:

1. Take a number of sub-pixel samples within each pixel
2. Compute a color for each sample
3. Combine the sample colors and positions in a final “resolve” step to produce the pixel’s output color
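```cpp
const int samples_per_pixel = 100;

for (int j = image_height - 1; j >= 0; --j) {
    std::cerr << "\rScanlines remaining: " << j << ' ' << std::flush;
    for (int i = 0; i < image_width; ++i) {
        color pixel_color(0, 0, 0);
        // Accumulate samples at random offsets inside the pixel.
        for (int s = 0; s < samples_per_pixel; ++s) {
            auto u = (i + random_double()) / (image_width - 1);
            auto v = (j + random_double()) / (image_height - 1);
            ray r = cam.get_ray(u, v);
            pixel_color += ray_color(r, world);
        }
        // write_color divides by samples_per_pixel (the "resolve" step).
        write_color(std::cout, pixel_color, samples_per_pixel);
    }
}
```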

Zooming into the image that is produced, we can see the difference in edge pixels.

Diffuse Materials

Now that we have objects and multiple rays per pixel, we can make some realistic looking materials. We’ll start with diffuse (matte) materials. One question is whether we mix and match geometry and materials (so we can assign a material to multiple spheres, or vice versa) or if geometry and material are tightly bound (that could be useful for procedural objects where the geometry and material are linked). We’ll go with separate — which is usual in most renderers — but do be aware of the limitation.

A Simple Diffuse Material

Diffuse objects that don’t emit light merely take on the color of their surroundings, but they modulate that with their own intrinsic color. Light that reflects off a diffuse surface has its direction randomized. So, if we send three rays into a crack between two diffuse surfaces they will each have different random behavior:

They also might be absorbed rather than reflected. The darker the surface, the more likely absorption is. (That’s why it is dark!) Really any algorithm that randomizes direction will produce surfaces that look matte. One of the simplest ways to do this turns out to be exactly correct for ideal diffuse surfaces. (I used to do it as a lazy hack that approximates mathematically ideal Lambertian.)

(Reader Vassillen Chizhov proved that the lazy hack is indeed just a lazy hack and is inaccurate. The correct representation of ideal Lambertian isn’t much more work, and is presented at the end of the chapter.)


There are two unit radius spheres tangent to the hit point $\mathbf{P}$ of a surface. These two spheres have centers of $(\mathbf{P}+\mathbf{n})$ and $(\mathbf{P}-\mathbf{n})$, where $\mathbf{n}$ is the normal of the surface. The sphere with a center at $(\mathbf{P}-\mathbf{n})$ is considered inside the surface, whereas the sphere with center $(\mathbf{P}+\mathbf{n})$ is considered outside the surface. Select the tangent unit radius sphere that is on the same side of the surface as the ray origin. Pick a random point $\mathbf{S}$ inside this unit radius sphere and send a ray from the hit point $\mathbf{P}$ to the random point $\mathbf{S}$ (this is the vector $(\mathbf{S}-\mathbf{P})$):

We need a way to pick a random point in a unit radius sphere. We’ll use what is usually the easiest algorithm: a rejection method. First, pick a random point in the unit cube where x, y, and z all range from −1 to +1. Reject this point and try again if the point is outside the sphere.

Add the following to vec3.h:
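A sketch of those additions (random_double is assumed to be visible from rtweekend.h):

```cpp
class vec3 {
    public:
        // ...existing members as before...

        inline static vec3 random() {
            return vec3(random_double(), random_double(), random_double());
        }

        inline static vec3 random(double min, double max) {
            return vec3(random_double(min,max), random_double(min,max), random_double(min,max));
        }
};

// Rejection method: pick points in the unit cube until one lies inside the unit sphere.
vec3 random_in_unit_sphere() {
    while (true) {
        auto p = vec3::random(-1, 1);
        if (p.length_squared() >= 1) continue;
        return p;
    }
}
```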

Then update the ray_color() function to use the new random direction generator:
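A sketch of the recursive diffuse bounce:

```cpp
color ray_color(const ray& r, const hittable& world) {
    hit_record rec;
    if (world.hit(r, 0, infinity, rec)) {
        // Bounce toward a random point in the unit sphere tangent at the hit point.
        point3 target = rec.p + rec.normal + random_in_unit_sphere();
        return 0.5 * ray_color(ray(rec.p, target - rec.p), world);
    }
    vec3 unit_direction = unit_vector(r.direction());
    auto t = 0.5*(unit_direction.y() + 1.0);
    return (1.0-t)*color(1.0, 1.0, 1.0) + t*color(0.5, 0.7, 1.0);
}
```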

Limiting the Number of Child Rays

There’s one potential problem lurking here. Notice that the ray_color function is recursive. When will it stop recursing? When it fails to hit anything. In some cases, however, that may be a long time — long enough to blow the stack. To guard against that, let’s limit the maximum recursion depth, returning no light contribution at the maximum depth:
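A sketch of the depth-limited version (main would define something like `const int max_depth = 50;` and pass it to the first call):

```cpp
color ray_color(const ray& r, const hittable& world, int depth) {
    hit_record rec;

    // If we've exceeded the ray bounce limit, no more light is gathered.
    if (depth <= 0)
        return color(0, 0, 0);

    if (world.hit(r, 0, infinity, rec)) {
        point3 target = rec.p + rec.normal + random_in_unit_sphere();
        return 0.5 * ray_color(ray(rec.p, target - rec.p), world, depth-1);
    }
    vec3 unit_direction = unit_vector(r.direction());
    auto t = 0.5*(unit_direction.y() + 1.0);
    return (1.0-t)*color(1.0, 1.0, 1.0) + t*color(0.5, 0.7, 1.0);
}
```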

This gives us:

Using Gamma Correction for Accurate Color Intensity

Note the shadowing under the sphere. This picture is very dark, but our spheres only absorb half the energy on each bounce, so they are 50% reflectors. If you can’t see the shadow, don’t worry, we will fix that now. These spheres should look pretty light (in real life, a light grey). The reason for this is that almost all image viewers assume that the image is “gamma corrected”, meaning the 0 to 1 values have some transform before being stored as a byte. There are many good reasons for that, but for our purposes we just need to be aware of it. To a first approximation, we can use “gamma 2” which means raising the color to the power $1/\text{gamma}$, or in our simple case $1/2$, which is just square-root:
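The scaling step in write_color becomes:

```cpp
    // Divide the color by the number of samples and gamma-correct for gamma=2.0
    // (raising to the power 1/2 is just a square root).
    auto scale = 1.0 / samples_per_pixel;
    r = std::sqrt(scale * r);
    g = std::sqrt(scale * g);
    b = std::sqrt(scale * b);
```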

That yields light grey, as we desire:

There’s also a subtle bug in there. Some of the reflected rays hit the object they are reflecting off of not at exactly 𝑡=0, but instead at 𝑡=−0.0000001 or 𝑡=0.00000001 or whatever floating point approximation the sphere intersector gives us. So we need to ignore hits very near zero:
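In ray_color, the lower bound of the hit interval changes from 0 to a small epsilon:

```cpp
if (world.hit(r, 0.001, infinity, rec)) {   // ignore hits very near t = 0
```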

This gets rid of the shadow acne problem. Yes it is really called that.

True Lambertian Reflection

The rejection method presented here produces random points in the unit ball offset along the surface normal. This corresponds to picking directions on the hemisphere with high probability close to the normal, and a lower probability of scattering rays at grazing angles. This distribution scales by $\cos^3(\phi)$, where $\phi$ is the angle from the normal. This is useful since light arriving at shallow angles spreads over a larger area, and thus has a lower contribution to the final color.

However, we are interested in a Lambertian distribution, which has a distribution of $\cos(\phi)$. True Lambertian has the probability higher for ray scattering close to the normal, but the distribution is more uniform. This is achieved by picking random points on the surface of the unit sphere, offset along the surface normal. Picking random points on the unit sphere can be achieved by picking random points in the unit sphere, and then normalizing those.
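That description maps directly to code; a sketch:

```cpp
vec3 random_unit_vector() {
    // Pick a random point in the unit sphere and normalize it onto the surface.
    return unit_vector(random_in_unit_sphere());
}
```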

This random_unit_vector() is a drop-in replacement for the existing random_in_unit_sphere() function.

After rendering we get a similar image:

It’s hard to tell the difference between these two diffuse methods, given that our scene of two spheres is so simple, but you should be able to notice two important visual differences:

1. The shadows are less pronounced after the change
2. Both spheres are lighter in appearance after the change

Both of these changes are due to the more uniform scattering of the light rays: fewer rays scatter toward the normal. This means that diffuse objects will appear lighter because more light bounces toward the camera. For the shadows, less light bounces straight up, so the parts of the larger sphere directly underneath the smaller sphere are brighter.


An Alternative Diffuse Formulation

The initial hack presented in this book lasted a long time before it was proven to be an incorrect approximation of ideal Lambertian diffuse. A big reason that the error persisted for so long is that it can be difficult to:

1. Mathematically prove that the probability distribution is incorrect
2. Intuitively explain why a cos(𝜙) distribution is desirable (and what it would look like)

Not a lot of common, everyday objects are perfectly diffuse, so our visual intuition of how these objects behave under light can be poorly formed.


In the interest of learning, we are including an intuitive and easy to understand diffuse method. For the two methods above we had a random vector, first of random length and then of unit length, offset from the hit point by the normal. It may not be immediately obvious why the vectors should be displaced by the normal.

A more intuitive approach is to have a uniform scatter direction for all angles away from the hit point, with no dependence on the angle from the normal. Many of the first raytracing papers used this diffuse method (before adopting Lambertian diffuse).
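A sketch of that hemispherical scattering (in ray_color, the target would become `rec.p + random_in_hemisphere(rec.normal)`):

```cpp
vec3 random_in_hemisphere(const vec3& normal) {
    vec3 in_unit_sphere = random_in_unit_sphere();
    if (dot(in_unit_sphere, normal) > 0.0)   // same hemisphere as the normal
        return in_unit_sphere;
    else
        return -in_unit_sphere;
}
```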

Gives us the following image:

Scenes will become more complicated over the course of the book. You are encouraged to switch between the different diffuse renderers presented here. Most scenes of interest will contain a disproportionate amount of diffuse materials. You can gain valuable insight by understanding the effect of different diffuse methods on the lighting of the scene.

Metal

An Abstract Class for Materials

If we want different objects to have different materials, we have a design decision. We could have a universal material with lots of parameters and different material types just zero out some of those parameters. This is not a bad approach. Or we could have an abstract material class that encapsulates behavior. I am a fan of the latter approach. For our program the material needs to do two things:

1. Produce a scattered ray (or say it absorbed the incident ray).
2. If scattered, say how much the ray should be attenuated.

This suggests the abstract class:
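A sketch of it:

```cpp
#ifndef MATERIAL_H
#define MATERIAL_H

#include "rtweekend.h"

struct hit_record;

class material {
    public:
        // Returns false if the incident ray was absorbed. Otherwise fills in the
        // scattered ray and the attenuation (how much each color channel survives).
        virtual bool scatter(
            const ray& r_in, const hit_record& rec, color& attenuation, ray& scattered
        ) const = 0;
};

#endif
```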


A Data Structure to Describe Ray-Object Intersections

The hit_record is to avoid a bunch of arguments so we can stuff whatever info we want in there. You can use arguments instead; it’s a matter of taste. Hittables and materials need to know each other so there is some circularity of the references. In C++ you just need to alert the compiler that the pointer is to a class, which the “class material” in the hittable class below does:
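A sketch of the updated struct with the forward declaration:

```cpp
#include "rtweekend.h"

class material;   // hittables and materials reference each other

struct hit_record {
    point3 p;
    vec3 normal;
    shared_ptr<material> mat_ptr;   // material of the surface that was hit
    double t;
    bool front_face;

    inline void set_face_normal(const ray& r, const vec3& outward_normal) {
        front_face = dot(r.direction(), outward_normal) < 0;
        normal = front_face ? outward_normal : -outward_normal;
    }
};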

What we have set up here is that material will tell us how rays interact with the surface. hit_record is just a way to stuff a bunch of arguments into a struct so we can send them as a group. When a ray hits a surface (a particular sphere for example), the material pointer in the hit_record will be set to point at the material pointer the sphere was given when it was set up in main() when we start. When the ray_color() routine gets the hit_record it can call member functions of the material pointer to find out what ray, if any, is scattered.

To achieve this, we must add a reference to the material to our sphere class so it can be returned within hit_record. See the sketch below:
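```cpp
class sphere : public hittable {
    public:
        sphere() {}
        sphere(point3 cen, double r, shared_ptr<material> m)
            : center(cen), radius(r), mat_ptr(m) {};

        virtual bool hit(const ray& r, double t_min, double t_max, hit_record& rec) const override;

    public:
        point3 center;
        double radius;
        shared_ptr<material> mat_ptr;
};

// ...and inside sphere::hit, after filling rec.t / rec.p / the normal:
//     rec.mat_ptr = mat_ptr;
```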

Modeling Light Scatter and Reflectance

For the Lambertian (diffuse) case we already have, it can either scatter always and attenuate by its reflectance $R$, or it can scatter with no attenuation but absorb the fraction $1-R$ of the rays, or it could be a mixture of those strategies. For Lambertian materials we get this simple class:
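A sketch, taking the always-scatter-and-attenuate option:

```cpp
class lambertian : public material {
    public:
        lambertian(const color& a) : albedo(a) {}

        virtual bool scatter(
            const ray& r_in, const hit_record& rec, color& attenuation, ray& scattered
        ) const override {
            auto scatter_direction = rec.normal + random_unit_vector();
            scattered = ray(rec.p, scatter_direction);
            attenuation = albedo;   // always scatter, attenuate by reflectance
            return true;
        }

    public:
        color albedo;
};
```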

Mirrored Light Reflection

Note we could just as well only scatter with some probability $p$ and have attenuation be $\text{albedo}/p$. Your choice.

If you read the code above carefully, you’ll notice a small chance of mischief. If the random unit vector we generate is exactly opposite the normal vector, the two will sum to zero, which will result in a zero scatter direction vector. This leads to bad scenarios later on (infinities and NaNs), so we need to intercept the condition before we pass it on.
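A sketch of the fix: a near_zero() helper on vec3, and a guard in lambertian::scatter:

```cpp
        // In vec3: return true if the vector is close to zero in all dimensions.
        bool near_zero() const {
            const auto s = 1e-8;
            return (std::fabs(e[0]) < s) && (std::fabs(e[1]) < s) && (std::fabs(e[2]) < s);
        }
```

```cpp
            auto scatter_direction = rec.normal + random_unit_vector();

            // Catch degenerate scatter direction (random vector ~ -normal).
            if (scatter_direction.near_zero())
                scatter_direction = rec.normal;

            scattered = ray(rec.p, scatter_direction);
```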

For smooth metals the ray won’t be randomly scattered. The key math is: how does a ray get reflected from a metal mirror? Vector math is our friend here:

The reflected ray direction in red is just $\mathbf{v} + 2\mathbf{b}$. In our design, $\mathbf{n}$ is a unit vector, but $\mathbf{v}$ may not be. The length of $\mathbf{b}$ should be $\mathbf{v} \cdot \mathbf{n}$. Because $\mathbf{v}$ points in, we will need a minus sign, yielding:

$$\mathbf{v} - 2(\mathbf{v} \cdot \mathbf{n})\mathbf{n}$$

The metal material just reflects rays using that formula:
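A sketch: a reflect helper in vec3.h, and the metal class in material.h:

```cpp
vec3 reflect(const vec3& v, const vec3& n) {
    return v - 2*dot(v,n)*n;
}

class metal : public material {
    public:
        metal(const color& a) : albedo(a) {}

        virtual bool scatter(
            const ray& r_in, const hit_record& rec, color& attenuation, ray& scattered
        ) const override {
            vec3 reflected = reflect(unit_vector(r_in.direction()), rec.normal);
            scattered = ray(rec.p, reflected);
            attenuation = albedo;
            // Only count the scatter if it leaves the surface.
            return (dot(scattered.direction(), rec.normal) > 0);
        }

    public:
        color albedo;
};
```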

We need to modify the ray_color() function to use this:
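A sketch of the material-driven version:

```cpp
color ray_color(const ray& r, const hittable& world, int depth) {
    hit_record rec;

    if (depth <= 0)
        return color(0, 0, 0);

    if (world.hit(r, 0.001, infinity, rec)) {
        ray scattered;
        color attenuation;
        // Ask the hit object's material what happens to the ray.
        if (rec.mat_ptr->scatter(r, rec, attenuation, scattered))
            return attenuation * ray_color(scattered, world, depth-1);
        return color(0, 0, 0);   // the ray was absorbed
    }

    vec3 unit_direction = unit_vector(r.direction());
    auto t = 0.5*(unit_direction.y() + 1.0);
    return (1.0-t)*color(1.0, 1.0, 1.0) + t*color(0.5, 0.7, 1.0);
}
```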

A Scene with Metal Spheres

Now let’s add some metal spheres to our scene:
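A sketch of the world setup in main (the exact albedos are from memory of the book's scene and may differ):

```cpp
    hittable_list world;

    auto material_ground = make_shared<lambertian>(color(0.8, 0.8, 0.0));
    auto material_center = make_shared<lambertian>(color(0.7, 0.3, 0.3));
    auto material_left   = make_shared<metal>(color(0.8, 0.8, 0.8));
    auto material_right  = make_shared<metal>(color(0.8, 0.6, 0.2));

    world.add(make_shared<sphere>(point3( 0.0, -100.5, -1.0), 100.0, material_ground));
    world.add(make_shared<sphere>(point3( 0.0,    0.0, -1.0),   0.5, material_center));
    world.add(make_shared<sphere>(point3(-1.0,    0.0, -1.0),   0.5, material_left));
    world.add(make_shared<sphere>(point3( 1.0,    0.0, -1.0),   0.5, material_right));
```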

Fuzzy Reflection

We can also randomize the reflected direction by using a small sphere and choosing a new endpoint for the ray:

The bigger the sphere, the fuzzier the reflections will be. This suggests adding a fuzziness parameter that is just the radius of the sphere (so zero is no perturbation). The catch is that for big spheres or grazing rays, we may scatter below the surface. We can just have the surface absorb those.
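A sketch of metal with the fuzz parameter:

```cpp
class metal : public material {
    public:
        // fuzz is the radius of the perturbation sphere; clamp it to at most 1.
        metal(const color& a, double f) : albedo(a), fuzz(f < 1 ? f : 1) {}

        virtual bool scatter(
            const ray& r_in, const hit_record& rec, color& attenuation, ray& scattered
        ) const override {
            vec3 reflected = reflect(unit_vector(r_in.direction()), rec.normal);
            scattered = ray(rec.p, reflected + fuzz*random_in_unit_sphere());
            attenuation = albedo;
            // Rays perturbed below the surface are absorbed.
            return (dot(scattered.direction(), rec.normal) > 0);
        }

    public:
        color albedo;
        double fuzz;
};
```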

Dielectrics

Clear materials such as water, glass, and diamonds are dielectrics. When a light ray hits them, it splits into a reflected ray and a refracted (transmitted) ray. We’ll handle that by randomly choosing between reflection or refraction, and only generating one scattered ray per interaction.

Refraction

The hardest part to debug is the refracted ray. I usually first just have all the light refract if there is a refraction ray at all. For this project, I tried to put two glass balls in our scene, and I got this (I have not told you how to do this right or wrong yet, but soon!):

Is that right? Glass balls look odd in real life. But no, it isn’t right. The world should be flipped upside down and no weird black stuff. I just printed out the ray straight through the middle of the image and it was clearly wrong. That often does the job.

Snell’s Law

The refraction is described by Snell’s law:

$$\eta \cdot \sin\theta = \eta' \cdot \sin\theta'$$

Where 𝜃 and 𝜃′ are the angles from the normal, and 𝜂 and 𝜂′ (pronounced “eta” and “eta prime”) are the refractive indices (typically air = 1.0, glass = 1.3–1.7, diamond = 2.4). The geometry is:



In order to determine the direction of the refracted ray, we have to solve for $\sin\theta'$:

$$\sin\theta' = \frac{\eta}{\eta'} \cdot \sin\theta$$

On the refracted side of the surface there is a refracted ray 𝐑′ and a normal 𝐧′, and there exists an angle, 𝜃′, between them. We can split 𝐑′ into the parts of the ray that are perpendicular to 𝐧′ and parallel to 𝐧′:

$$\mathbf{R'} = \mathbf{R'}_{\bot} + \mathbf{R'}_{\parallel}$$

If we solve for 𝐑′⊥ and 𝐑′∥ we get:

$$\mathbf{R'}_{\bot} = \frac{\eta}{\eta'}(\mathbf{R} + \cos\theta\,\mathbf{n})$$

$$\mathbf{R'}_{\parallel} = -\sqrt{1 - |\mathbf{R'}_{\bot}|^2}\,\mathbf{n}$$

You can go ahead and prove this for yourself if you want, but we will treat it as fact and move on. The rest of the book will not require you to understand the proof.

We still need to solve for $\cos\theta$. It is well known that the dot product of two vectors can be explained in terms of the cosine of the angle between them:

$$\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}||\mathbf{b}| \cos\theta$$

If we restrict $\mathbf{a}$ and $\mathbf{b}$ to be unit vectors:

$$\mathbf{a} \cdot \mathbf{b} = \cos\theta$$

We can now rewrite $\mathbf{R'}_{\bot}$ in terms of known quantities:

$$\mathbf{R'}_{\bot} = \frac{\eta}{\eta'}(\mathbf{R} + (-\mathbf{R} \cdot \mathbf{n})\mathbf{n})$$

Total Internal Reflection

That definitely doesn’t look right. One troublesome practical issue is that when the ray is in the material with the higher refractive index, there is no real solution to Snell’s law, and thus there is no refraction possible. If we refer back to Snell’s law and the derivation of $\sin\theta'$:

$$\sin\theta' = \frac{\eta}{\eta'} \cdot \sin\theta$$

If the ray is inside glass and outside is air ($\eta = 1.5$ and $\eta' = 1.0$):

$$\sin\theta' = \frac{1.5}{1.0} \cdot \sin\theta$$

The value of $\sin\theta'$ cannot be greater than 1. So, if,

$$\frac{1.5}{1.0} \cdot \sin\theta > 1.0$$

the equality between the two sides of the equation is broken, and a solution cannot exist. If a solution does not exist, the glass cannot refract, and therefore must reflect the ray:

Here all the light is reflected, and because in practice that is usually inside solid objects, it is called “total internal reflection”. This is why sometimes the water-air boundary acts as a perfect mirror when you are submerged.

We can solve for $\sin\theta$ using the trigonometric identities:

$$\sin\theta = \sqrt{1 - \cos^2\theta}$$

and

$$\cos\theta = \mathbf{R} \cdot \mathbf{n}$$

And the dielectric material that always refracts (when possible) is:
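A sketch: a refract helper implementing the perpendicular/parallel decomposition above, and a dielectric class that refracts whenever Snell's law has a solution and reflects on total internal reflection:

```cpp
vec3 refract(const vec3& uv, const vec3& n, double etai_over_etat) {
    auto cos_theta = std::fmin(dot(-uv, n), 1.0);
    vec3 r_out_perp = etai_over_etat * (uv + cos_theta*n);
    vec3 r_out_parallel = -std::sqrt(std::fabs(1.0 - r_out_perp.length_squared())) * n;
    return r_out_perp + r_out_parallel;
}

class dielectric : public material {
    public:
        dielectric(double index_of_refraction) : ir(index_of_refraction) {}

        virtual bool scatter(
            const ray& r_in, const hit_record& rec, color& attenuation, ray& scattered
        ) const override {
            attenuation = color(1.0, 1.0, 1.0);   // glass absorbs nothing
            // Entering the object: eta/eta' = 1/ir; exiting: ir.
            double refraction_ratio = rec.front_face ? (1.0/ir) : ir;

            vec3 unit_direction = unit_vector(r_in.direction());
            double cos_theta = std::fmin(dot(-unit_direction, rec.normal), 1.0);
            double sin_theta = std::sqrt(1.0 - cos_theta*cos_theta);

            // Total internal reflection: Snell's law has no solution, so reflect.
            bool cannot_refract = refraction_ratio * sin_theta > 1.0;
            vec3 direction;
            if (cannot_refract)
                direction = reflect(unit_direction, rec.normal);
            else
                direction = refract(unit_direction, rec.normal, refraction_ratio);

            scattered = ray(rec.p, direction);
            return true;
        }

    public:
        double ir;   // index of refraction
};
```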

Attenuation is always 1 — the glass surface absorbs nothing. If we try that out with these parameters:
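A sketch of the material setup (values from memory of the book's scene):

```cpp
    auto material_ground = make_shared<lambertian>(color(0.8, 0.8, 0.0));
    auto material_center = make_shared<lambertian>(color(0.1, 0.2, 0.5));
    auto material_left   = make_shared<dielectric>(1.5);
    auto material_right  = make_shared<metal>(color(0.8, 0.6, 0.2), 0.0);
```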

We get:

Schlick Approximation

Now real glass has reflectivity that varies with angle — look at a window at a steep angle and it becomes a mirror. There is a big ugly equation for that, but almost everybody uses a cheap and surprisingly accurate polynomial approximation by Christophe Schlick. This yields our full glass material:
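A sketch of the full class, adding Schlick's approximation as a private helper and reflecting with the resulting probability:

```cpp
class dielectric : public material {
    public:
        dielectric(double index_of_refraction) : ir(index_of_refraction) {}

        virtual bool scatter(
            const ray& r_in, const hit_record& rec, color& attenuation, ray& scattered
        ) const override {
            attenuation = color(1.0, 1.0, 1.0);
            double refraction_ratio = rec.front_face ? (1.0/ir) : ir;

            vec3 unit_direction = unit_vector(r_in.direction());
            double cos_theta = std::fmin(dot(-unit_direction, rec.normal), 1.0);
            double sin_theta = std::sqrt(1.0 - cos_theta*cos_theta);

            bool cannot_refract = refraction_ratio * sin_theta > 1.0;
            vec3 direction;

            // Reflect on total internal reflection, or randomly with probability
            // given by Schlick's angle-dependent reflectance.
            if (cannot_refract || reflectance(cos_theta, refraction_ratio) > random_double())
                direction = reflect(unit_direction, rec.normal);
            else
                direction = refract(unit_direction, rec.normal, refraction_ratio);

            scattered = ray(rec.p, direction);
            return true;
        }

    public:
        double ir;

    private:
        static double reflectance(double cosine, double ref_idx) {
            // Schlick's polynomial approximation for reflectivity vs. angle.
            auto r0 = (1-ref_idx) / (1+ref_idx);
            r0 = r0*r0;
            return r0 + (1-r0)*std::pow((1 - cosine), 5);
        }
};
```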

Modeling a Hollow Glass Sphere

An interesting and easy trick with dielectric spheres is to note that if you use a negative radius, the geometry is unaffected, but the surface normal points inward. This can be used as a bubble to make a hollow glass sphere:
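A sketch of the trick (the −0.45 inner radius is from memory of the book's scene):

```cpp
    world.add(make_shared<sphere>(point3(-1.0, 0.0, -1.0),   0.5, material_left));
    // Negative radius: geometry is unchanged but the normal points inward,
    // turning this inner sphere into the air bubble of a hollow glass ball.
    world.add(make_shared<sphere>(point3(-1.0, 0.0, -1.0), -0.45, material_left));
```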

Positionable Camera

Cameras, like dielectrics, are a pain to debug. So I always develop mine incrementally. First, let’s allow an adjustable field of view (fov). This is the angle you see through the portal. Since our image is not square, the fov is different horizontally and vertically. I always use vertical fov. I also usually specify it in degrees and change to radians inside a constructor — a matter of personal taste.

Camera Viewing Geometry

I first keep the rays coming from the origin and heading to the $z = -1$ plane. We could make it the $z = -2$ plane, or whatever, as long as we made $h$ a ratio to that distance. Here is our setup:

This implies $ℎ=\tan(\frac{𝜃}{2})$​. Our camera now becomes:
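A sketch of the adjustable-fov camera:

```cpp
class camera {
    public:
        camera(
            double vfov,          // vertical field-of-view in degrees
            double aspect_ratio
        ) {
            auto theta = degrees_to_radians(vfov);
            auto h = std::tan(theta/2);
            auto viewport_height = 2.0 * h;
            auto viewport_width = aspect_ratio * viewport_height;

            auto focal_length = 1.0;

            origin = point3(0, 0, 0);
            horizontal = vec3(viewport_width, 0.0, 0.0);
            vertical = vec3(0.0, viewport_height, 0.0);
            lower_left_corner = origin - horizontal/2 - vertical/2 - vec3(0, 0, focal_length);
        }

        ray get_ray(double u, double v) const {
            return ray(origin, lower_left_corner + u*horizontal + v*vertical - origin);
        }

    private:
        point3 origin;
        point3 lower_left_corner;
        vec3 horizontal;
        vec3 vertical;
};
```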

When calling it with camera cam(90, aspect_ratio) and these spheres:
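A sketch of that test scene (two touching spheres sized so they span the view):

```cpp
    auto R = std::cos(pi/4);
    hittable_list world;

    auto material_left  = make_shared<lambertian>(color(0, 0, 1));
    auto material_right = make_shared<lambertian>(color(1, 0, 0));

    world.add(make_shared<sphere>(point3(-R, 0, -1), R, material_left));
    world.add(make_shared<sphere>(point3( R, 0, -1), R, material_right));
```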

Positioning and Orienting the Camera

To get an arbitrary viewpoint, let’s first name the points we care about. We’ll call the position where we place the camera lookfrom, and the point we look at lookat. (Later, if you want, you could define a direction to look in instead of a point to look at.)

We also need a way to specify the roll, or sideways tilt, of the camera: the rotation around the lookat-lookfrom axis. Another way to think about it is that even if you keep lookfrom and lookat constant, you can still rotate your head around your nose. What we need is a way to specify an “up” vector for the camera. This up vector should lie in the plane orthogonal to the view direction.

We can actually use any up vector we want, and simply project it onto this plane to get an up vector for the camera. I use the common convention of naming a “view up” (vup) vector. A couple of cross products, and we now have a complete orthonormal basis (𝑢,𝑣,𝑤) to describe our camera’s orientation.

Remember that vup, v, and w are all in the same plane. Note that, like before when our fixed camera faced -Z, our arbitrary view camera faces -w. And keep in mind that we can — but we don’t have to — use world up (0,1,0) to specify vup. This is convenient and will naturally keep your camera horizontally level until you decide to experiment with crazy camera angles.
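A sketch of the camera built on that orthonormal basis:

```cpp
class camera {
    public:
        camera(
            point3 lookfrom,
            point3 lookat,
            vec3   vup,
            double vfov,          // vertical field-of-view in degrees
            double aspect_ratio
        ) {
            auto theta = degrees_to_radians(vfov);
            auto h = std::tan(theta/2);
            auto viewport_height = 2.0 * h;
            auto viewport_width = aspect_ratio * viewport_height;

            // Orthonormal basis: w points opposite the view direction,
            // u points camera-right, v points camera-up.
            auto w = unit_vector(lookfrom - lookat);
            auto u = unit_vector(cross(vup, w));
            auto v = cross(w, u);

            origin = lookfrom;
            horizontal = viewport_width * u;
            vertical = viewport_height * v;
            lower_left_corner = origin - horizontal/2 - vertical/2 - w;
        }

        ray get_ray(double s, double t) const {
            return ray(origin, lower_left_corner + s*horizontal + t*vertical - origin);
        }

    private:
        point3 origin;
        point3 lower_left_corner;
        vec3 horizontal;
        vec3 vertical;
};
```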

We’ll change back to the prior scene, and use the new viewpoint:
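(Viewpoint values from memory of the book's scene:)

```cpp
camera cam(point3(-2,2,1), point3(0,0,-1), vec3(0,1,0), 90, aspect_ratio);
```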

to get:

And we can change field of view:
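```cpp
camera cam(point3(-2,2,1), point3(0,0,-1), vec3(0,1,0), 20, aspect_ratio);
```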

to get:

Defocus Blur

Now our final feature: defocus blur. Note, all photographers will call it “depth of field” so be aware of only using “defocus blur” among friends.

The reason we get defocus blur in real cameras is because they need a big hole (rather than just a pinhole) to gather light. This would defocus everything, but if we stick a lens in the hole, there will be a certain distance where everything is in focus. You can think of a lens this way: all light rays coming from a specific point at the focus distance — and that hit the lens — will be bent back to a single point on the image sensor.

In a physical camera, the focus distance is controlled by the distance between the lens and the film/sensor. That is why you see the lens move relative to the camera when you change what is in focus (that may happen in your phone camera too, but the sensor moves). The “aperture” is a hole to control how big the lens is effectively. For a real camera, if you need more light you make the aperture bigger, and will get more defocus blur. For our virtual camera, we can have a perfect sensor and never need more light, so we only have an aperture when we want defocus blur.

A Thin Lens Approximation

A real camera has a complicated compound lens. For our code we could simulate the order: sensor, then lens, then aperture. Then we could figure out where to send the rays, and flip the image after it’s computed (the image is projected upside down on the film). Graphics people, however, usually use a thin lens approximation:

We don’t need to simulate any of the inside of the camera. For the purposes of rendering an image outside the camera, that would be unnecessary complexity. Instead, I usually start rays from the lens, and send them toward the focus plane (focus_dist away from the lens), where everything on that plane is in perfect focus.

Generating Sample Rays

Normally, all scene rays originate from the lookfrom point. In order to accomplish defocus blur, generate random scene rays originating from inside a disk centered at the lookfrom point. The larger the radius, the greater the defocus blur. You can think of our original camera as having a defocus disk of radius zero (no blur at all), so all rays originated at the disk center (lookfrom).
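A sketch of that: a random_in_unit_disk helper in vec3.h (same rejection method as for the sphere), and a camera that scales the viewport out to the focus plane and offsets ray origins by a random point on the lens disk:

```cpp
vec3 random_in_unit_disk() {
    while (true) {
        auto p = vec3(random_double(-1,1), random_double(-1,1), 0);
        if (p.length_squared() >= 1) continue;
        return p;
    }
}

class camera {
    public:
        camera(
            point3 lookfrom, point3 lookat, vec3 vup,
            double vfov, double aspect_ratio,
            double aperture, double focus_dist
        ) {
            auto theta = degrees_to_radians(vfov);
            auto h = std::tan(theta/2);
            auto viewport_height = 2.0 * h;
            auto viewport_width = aspect_ratio * viewport_height;

            w = unit_vector(lookfrom - lookat);
            u = unit_vector(cross(vup, w));
            v = cross(w, u);

            origin = lookfrom;
            // Scale the viewport out to the focus plane.
            horizontal = focus_dist * viewport_width * u;
            vertical = focus_dist * viewport_height * v;
            lower_left_corner = origin - horizontal/2 - vertical/2 - focus_dist*w;

            lens_radius = aperture / 2;
        }

        ray get_ray(double s, double t) const {
            // Offset the ray origin by a random point on the lens disk.
            vec3 rd = lens_radius * random_in_unit_disk();
            vec3 offset = u * rd.x() + v * rd.y();

            return ray(
                origin + offset,
                lower_left_corner + s*horizontal + t*vertical - origin - offset
            );
        }

    private:
        point3 origin;
        point3 lower_left_corner;
        vec3 horizontal;
        vec3 vertical;
        vec3 u, v, w;
        double lens_radius;
};
```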

Using a big aperture:
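(Values from memory of the book's scene:)

```cpp
point3 lookfrom(3,3,2);
point3 lookat(0,0,-1);
vec3 vup(0,1,0);
auto dist_to_focus = (lookfrom - lookat).length();
auto aperture = 2.0;

camera cam(lookfrom, lookat, vup, 20, aspect_ratio, aperture, dist_to_focus);
```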

We get:


Where Next?

A Final Render

First let’s make the image on the cover of this book — lots of random spheres:
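A sketch of that scene generator (exact material probabilities and positions from memory of the book's cover scene; main would render it with a high sample count and a distant camera):

```cpp
hittable_list random_scene() {
    hittable_list world;

    auto ground_material = make_shared<lambertian>(color(0.5, 0.5, 0.5));
    world.add(make_shared<sphere>(point3(0,-1000,0), 1000, ground_material));

    for (int a = -11; a < 11; a++) {
        for (int b = -11; b < 11; b++) {
            auto choose_mat = random_double();
            point3 center(a + 0.9*random_double(), 0.2, b + 0.9*random_double());

            if ((center - point3(4, 0.2, 0)).length() > 0.9) {
                shared_ptr<material> sphere_material;

                if (choose_mat < 0.8) {
                    // diffuse
                    auto albedo = color::random() * color::random();
                    sphere_material = make_shared<lambertian>(albedo);
                } else if (choose_mat < 0.95) {
                    // metal
                    auto albedo = color::random(0.5, 1);
                    auto fuzz = random_double(0, 0.5);
                    sphere_material = make_shared<metal>(albedo, fuzz);
                } else {
                    // glass
                    sphere_material = make_shared<dielectric>(1.5);
                }
                world.add(make_shared<sphere>(center, 0.2, sphere_material));
            }
        }
    }

    world.add(make_shared<sphere>(point3(0, 1, 0), 1.0, make_shared<dielectric>(1.5)));
    world.add(make_shared<sphere>(point3(-4, 1, 0), 1.0, make_shared<lambertian>(color(0.4, 0.2, 0.1))));
    world.add(make_shared<sphere>(point3(4, 1, 0), 1.0, make_shared<metal>(color(0.7, 0.6, 0.5), 0.0)));

    return world;
}
```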

An interesting thing you might note is the glass balls don’t really have shadows which makes them look like they are floating. This is not a bug — you don’t see glass balls much in real life, where they also look a bit strange, and indeed seem to float on cloudy days. A point on the big sphere under a glass ball still has lots of light hitting it because the sky is re-ordered rather than blocked.

Next Steps

You now have a cool ray tracer! What next?

1. Lights — You can do this explicitly, by sending shadow rays to lights, or it can be done implicitly by making some objects emit light, biasing scattered rays toward them, and then downweighting those rays to cancel out the bias. Both work. I am in the minority in favoring the latter approach.
2. Triangles — Most cool models are in triangle form. The model I/O is the worst and almost everybody tries to get somebody else’s code to do this.
3. Surface Textures — This lets you paste images on like wall paper. Pretty easy and a good thing to do.
4. Solid textures — Ken Perlin has his code online. Andrew Kensler has some very cool info at his blog.
5. Volumes and Media — Cool stuff and will challenge your software architecture. I favor making volumes have the hittable interface and probabilistically have intersections based on density. Your rendering code doesn’t even have to know it has volumes with that method.
6. Parallelism — Run $N$ copies of your code on $N$ cores with different random seeds. Average the $N$ runs. This averaging can also be done hierarchically, where $N/2$ pairs can be averaged to get $N/2$ images, and pairs of those can be averaged, and so on. That method of parallelism should extend well into the thousands of cores with very little coding.

Have fun, and please send me your cool images!
