Behind the Guesses: April 2009

A bad way

The dot product and cross product of two vectors are tools which are heavily used in physics. As such, they are typically introduced at the beginning of first semester physics courses, just after vector addition, subtraction, etc. Although they are not strictly required for these intro courses (see [1], for example), they make the development and computations of work and energy, torque, and electromagnetism far simpler.

Unfortunately, they are consistently introduced in an awful way: by straight definition. That is, using the dot product for example, for two vectors $\vec{A}=A_x\hat{x}+A_y\hat{y}+A_z\hat{z}$ and $\vec{B}=B_x\hat{x}+B_y\hat{y}+B_z\hat{z}$ they say something like

We define the dot product between $\vec{A}$ and $\vec{B}$ as:
$\vec{A}\cdot\vec{B}=A_xB_x+A_yB_y+A_zB_z,$
or,
$\vec{A}\cdot\vec{B}=|\vec{A}|\,|\vec{B}|\cos\theta,$
where $\theta$ is the angle between them.

Then, for the cross product, either they use an equation like the latter of the above two equations coupled with the ``right-hand rule,'' or a strange algebraic combination of the components of $\vec{A}$ and $\vec{B}$ , often ``simplified'' with help of a startling determinant.¹ See [2], [3], [4], [5] and [6] as a few examples. Although a few of these give a geometric interpretation after the fact, it is usually in passing, and does not really contribute to their discussion. These approaches are not limited to textbooks, either. See [7] for an in-class lecture example.

In these examples, the dot product is introduced first and then the cross product. From one standpoint this makes some sense -- the dot product is definitionally simpler and usually easier to calculate. However, from a conceptual standpoint, I think this order is backwards. Furthermore, in my experience, students, by and large, miss the physical and graphical significance of these definitions, and upon encountering the concepts of work or torque later on, take the resulting expressions purely as definitions as well.² This is yet another example of the fact that definition $\neq$ explanation.

Personally, it is my inclination to wait to introduce these products until they're needed, thus motivating the discussion in the first place. However, I do understand the notion of ``getting it over with,'' and it's possible that introducing them as abstract concepts lends itself to easier application of the concepts to general problems. In any case, my discussion follows the latter approach (for better insertion into standard texts) and presupposes understanding of vector basics: addition, decomposition, etc.

A better way

The Cross Product

(a) Geometrical view of the cross product as the parallelogram area.

(b) Graphical derivation of area for two 2D arbitrary vectors, from [8].

Figure 1: The 2D cross product of vectors $\vec{A}$ and $\vec{B}$ .

Say we have two vectors $\vec{A}$ and $\vec{B}$ with lengths $A$ and $B$ , and we want to find something which is a measure of how much of $\vec{B}$ is perpendicular to $\vec{A}$ . Looking at Fig. 1(a), we can see that the area of the parallelogram sided by the two vectors is such a measure. The area of a parallelogram is

$\text{area}=(\text{base})\times(\text{height}),$

(1)

which, for our case, is the same as

$\text{area}=(\text{length of one vector})\times(\text{amount of the other vector perpendicular to the first}).$

(2)

That is, you can only have an area if you have a ``base'' and a ``height'' perpendicular to the base. Thus area is a good measure of perpendicularity.³

There are two different ways of calculating this area. If the angle between the two vectors is $\theta$ , as in Fig. 1(a), we see that, choosing $\vec{A}$ as the ``base'' we can write the ``height'' as $B \sin\theta$ . Alternatively, choosing $\vec{B}$ as the base, we write the perpendicular part as $A \sin\theta$ . Then the area is

$\text{area}=(A)\,(B \sin\theta)=(B)\,(A\sin\theta)$ .

(3)

However, if we don't know the angle between them, we're not completely out of luck. If you look at Fig. 1(b), you can see that for a simple, two-dimensional case, we can express the area in terms of the $x$ and $y$ components of $\vec{A}$ and $\vec{B}$:

$\text{area}=A_xB_y-B_xA_y$

(4a)
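As a quick numerical check that (3) and (4a) agree, here is a minimal Python sketch. The function names and the example vectors (a length-3 vector along $x$ and a length-2 vector at 30 degrees to it) are my own choices for illustration and are not taken from the figures.

```python
import math

def area_from_angle(a_len, b_len, theta):
    """Parallelogram area from the two lengths and the angle between them, as in (3)."""
    return a_len * b_len * math.sin(theta)

def area_from_components(a, b):
    """Signed parallelogram area from the 2D components, as in (4a)."""
    ax, ay = a
    bx, by = b
    return ax * by - bx * ay

# Hypothetical example: A lies along x, B is rotated 30 degrees from A.
theta = math.radians(30.0)
A = (3.0, 0.0)
B = (2.0 * math.cos(theta), 2.0 * math.sin(theta))

area_angle = area_from_angle(3.0, 2.0, theta)
area_components = area_from_components(A, B)

# Both routes should give the same area, up to floating-point rounding.
print(area_angle, area_components)
```

Swapping the arguments of `area_from_components` flips the sign of the result, which is the same ambiguity in direction that is dealt with further on.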

Of course, I could just as easily have labeled the axes $y$ and $z$, which would give a different area

$\text{area}'=A_yB_z-B_yA_z,$

(4b)

or $z$ and $x$ , which would give yet another area

$\text{area}''=A_zB_x-B_zA_x$

(4c)

If all we've done is relabel our axes, keeping $\vec{A}$ and $\vec{B}$ fixed, then we wouldn't expect the size of these areas to be different -- and they're not. However, although the amount of area is the same, in a way the areas are different in that they're facing different directions in each case. So, we need a way to distinguish these three areas from each other, and from an arbitrarily oriented area. What we'll do is pick a vector perpendicular to both $\vec{A}$ and $\vec{B}$ -- and thus perpendicular to the area of the parallelogram -- with magnitude equal to the area. We'll call this vector

$\vec{C}=\vec{A}\times\vec{B},$

(5)

and say it's the result of a ``cross product'' of $\vec{A}$ and $\vec{B}$. However, in principle, we have a choice of two such perpendicular vectors. In Fig. 1, for example, we could choose the vector pointing in either the $+z$ or $-z$ direction. Additionally, this arbitrariness can be seen in choosing whether to measure the angle in (3) from $\vec{A}$ to $\vec{B}$ or vice versa.

So, as a matter of convention, we'll decide to always measure angles from the first term in the cross product ( $\vec{A}$ in (5)) such that

$\hat{x}\times\hat{y}=+\hat{z} ,$

(6)

so if the fingers of your right hand curl along the little arcs we draw for angles, your thumb points in the direction of this vector. Thus,

$\vec{A}\times\vec{B}=-\vec{B}\times\vec{A},$

(7)

since your hand would curl in the other direction. This is called the ``Right-Hand Rule.'' Then, the areas we discussed in equations (4) become

$\text{area}_{xy}=(A_xB_y-B_xA_y)\hat{z},$

(8a)

$\text{area}'_{yz}=(A_yB_z-B_yA_z)\hat{x},$

(8b)

and

$\text{area}''_{zx}=(A_zB_x-B_zA_x)\hat{y},$

(8c)

where the subscripts tell us which coordinate plane the two crossed vectors are in. Thus, the cross product represents how much these two vectors point in perpendicular directions, and is a signed area vector perpendicular to the plane described by $\vec{A}$ and $\vec{B}$ .

(a) Geometrical view of the 3D cross product as the parallelogram area.

(b) Looking at the area from the xy-plane (dashed outline), the yz-plane (shaded) and the zx-plane (solid).

Figure 2: The 3D cross product of vectors $\vec{A}$ and $\vec{B}$ and the decomposed area.

So far, though, we've only discussed vectors which have only two coplanar components. But it's fairly straightforward to generalize to arbitrary 3D vectors. See Fig. 2(a), for example. Here the area vector, and hence the cross product vector, is pointing in a complicated direction. However, we know we can decompose any vector into its $x$ , $y$ and $z$ components, and this area vector is no different:

$\vec{A}\times\vec{B}=\text{area}_{3D}=(\text{z area})\hat{z}+(\text{x area})\hat{x}+(\text{y area})\hat{y}$

(9)

All we need to do is find out how much area is pointing in each direction. To do that, look at Fig. 2(b). This picture shows what the area between the two vectors looks like if we look only at two coplanar components at a time -- in other words, the $z$, $x$ and $y$ components of the area. But we already know what each of these areas is from (8)! So we can combine these equations and write the cross product

$\vec{A}\times\vec{B}=(A_yB_z-B_yA_z)\hat{x}+(A_zB_x-B_zA_x)\hat{y}+(A_xB_y-B_xA_y)\hat{z}$

(10)
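Equation (10) is easy to try out numerically. The following is a minimal Python sketch, not a library routine; the helper names and the two example vectors are assumptions of mine. It checks the convention (6), the sign flip (7), and that the result is perpendicular to both inputs. For the perpendicularity test it borrows the component sum $A_xB_x+A_yB_y+A_zB_z$ quoted at the start of this post, which is zero for perpendicular vectors.

```python
import math

def cross(a, b):
    """Cross product from components, term by term as in (10)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - by * az,   # x component: area seen in the yz-plane
            az * bx - bz * ax,   # y component: area seen in the zx-plane
            ax * by - bx * ay)   # z component: area seen in the xy-plane

def component_sum(a, b):
    """Sum of products of matching components; zero for perpendicular vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

# Convention (6): x-hat cross y-hat should be +z-hat.
x_hat, y_hat = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
z_from_xy = cross(x_hat, y_hat)

# Hypothetical example vectors, chosen arbitrarily.
A = (1.0, 2.0, 3.0)
B = (-2.0, 0.5, 4.0)
C = cross(A, B)
C_swapped = cross(B, A)

print(z_from_xy)                                 # the z unit vector
print(C, C_swapped)                              # equal and opposite, as in (7)
print(component_sum(A, C), component_sum(B, C))  # both zero: C is perpendicular to A and B
```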

The Dot Product

(a) When B < A.

(b) When B > A.

Figure 3: The projection of vector $\vec{B}$ on to vector $\vec{A}$ .

Having discussed the perpendicularity of two vectors, it's natural to ask if there's a similar measure of the parallelity of two vectors. There are two ways of doing this. The way I'll do it first is explicitly geometrical; the second way is only implicitly geometrical.

Say we have two vectors $\vec{A}$ and $\vec{B}$ again, and we want to know how much of $\vec{B}$ is pointing (projected) along $\vec{A}$ . From Fig. 3 we see that this is equal to

$B \cos \theta.$

(11)

Similarly, the amount of $\vec{A}$ that is projected along $\vec{B}$ is

$A \cos \theta.$

(12)

Now, it would be nice if we could have one statement which somehow combined these two statements and gave a measure both of how much of $\vec{A}$ is pointing along $\vec{B}$ and of how much of $\vec{B}$ is pointing along $\vec{A}$; that is, a measure of how much these two vectors point in the same direction. Additionally, since (2) used a multiplicative combination of the two vectors as a measure of perpendicularity, we'll try a similar multiplicative measure here, as well.

If we multiply (11) by $A$ and (12) by $B$ we can write a single, symmetric statement

$D=\vec{A}\cdot\vec{B}=AB\cos\theta,$

(13)

and say it's the result of a ``dot product'' of $\vec{A}$ and $\vec{B}$ , which amounts to multiplying together the parallel parts of two vectors. Here, too, if we don't know the angle between them, we're not out of luck. For a vector written in component form, it's straightforward to multiply the parallel parts together:

$\vec{A}\cdot\vec{B}=A_xB_x+A_yB_y+A_zB_z.$

(14)
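To see that (13) and (14) give the same number, we need the angle from some source other than the dot product itself. The minimal Python sketch below gets $\cos\theta$ from the law of cosines applied to the triangle with sides $\vec{A}$, $\vec{B}$ and $\vec{A}-\vec{B}$. That route, the helper names and the example vectors are my own assumptions for illustration.

```python
import math

def dot(a, b):
    """Dot product from components, as in (14)."""
    return sum(ai * bi for ai, bi in zip(a, b))

def length(a):
    """Length of a vector from the Pythagorean theorem."""
    return math.sqrt(sum(ai * ai for ai in a))

def cos_angle_between(a, b):
    """cos(theta) from the law of cosines: |a-b|^2 = |a|^2 + |b|^2 - 2|a||b|cos(theta)."""
    diff = tuple(ai - bi for ai, bi in zip(a, b))
    la, lb, ld = length(a), length(b), length(diff)
    return (la**2 + lb**2 - ld**2) / (2.0 * la * lb)

# Hypothetical example vectors, chosen arbitrarily.
A = (1.0, 2.0, 3.0)
B = (-2.0, 0.5, 4.0)

dot_components = dot(A, B)                                   # equation (14)
dot_angle = length(A) * length(B) * cos_angle_between(A, B)  # equation (13)

# The two values should agree up to floating-point rounding.
print(dot_components, dot_angle)
```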

However, unlike the cross product which gave us an actual area with a natural direction, this area-like structure is actually a measure of ``non-area'' and doesn't really have a natural direction. Although we could, completely arbitrarily, define a direction for this dot product,⁴ and thus make it a vector as well, to the best of my knowledge such a quantity does not have any uses in physics, so we'll leave it alone and treat it only as a number (scalar).

Alternatively, we know that the largest area possible between two vectors occurs when they are perpendicular to each other, where the area is $AB$ (you can also see this from (3)). If we are interested in the maximal ``amount perpendicular'' we can write

$(\text{amount perpendicular})_{max}^2=(AB)^2,$

(15)

where they are squared to take care of sign problems. Now, when they are completely parallel there is no area, and we're left only with non-area, which, also, can't be larger than the total maximum area, so

$(\text{amount parallel})_{max}^2=(AB)^2,$

(16)

as well.

Then using a rough analogue to the Pythagorean theorem we see that

$(\text{amount parallel})^2+(\text{amount perpendicular})^2=(\text{max total amount})^2$

$(\text{amount parallel})^2=(\text{max total amount})^2-(\text{amount perpendicular})^2$

$(\text{amount parallel})^2=(AB)^2-(AB\sin\theta)^2$

$(\text{amount parallel})^2=(AB)^2\left[1-\sin^2\theta\right]$

$(\text{amount parallel})^2=(AB)^2\cos^2\theta$

$(\text{amount parallel})=\pm AB\cos\theta,$

(17)

which, choosing the positive root, is the same as (13).
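The Pythagorean-like relation in the first line of (17) can also be checked directly from components, taking the length of (10) as the ``amount perpendicular'' and (14) as the ``amount parallel.'' This is only a sketch with arbitrarily chosen example vectors and helper names of my own, not a proof.

```python
import math

def dot(a, b):
    """Amount parallel: the dot product from components, as in (14)."""
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    """Cross product from components, as in (10)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - by * az, az * bx - bz * ax, ax * by - bx * ay)

# Hypothetical example vectors, chosen arbitrarily.
A = (1.0, 2.0, 3.0)
B = (-2.0, 0.5, 4.0)

parallel_sq = dot(A, B) ** 2         # (amount parallel)^2
C = cross(A, B)
perpendicular_sq = dot(C, C)         # (amount perpendicular)^2, the squared area
max_total_sq = dot(A, A) * dot(B, B) # (AB)^2

# The first two should add up to the third.
print(parallel_sq + perpendicular_sq, max_total_sq)
```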

References

[1] F.W. Sears and M.Z. Zemansky. University Physics. Addison-Wesley, Reading, MA, 2nd edition, 1955.

[2] D. Halliday, R. Resnick, and J. Walker. Fundamentals of Physics. John Wiley & Sons, Inc., 7th edition, 2005.

[3] G.R. Fowles and G.L. Cassiday. Analytical Mechanics. Thomson Brooks/Cole, Belmont, CA, 7th edition, 2005.

[4] J.R. Reitz, F.J. Milford, and R.W. Christy. Foundations of Electromagnetic Theory. Addison-Wesley, 4th edition, 1992.

[5] D.J. Griffiths. Introduction to Quantum Mechanics. Pearson Prentice Hall, 2nd edition, 2005.

[6] J. Stewart. Multivariable Calculus. Brooks/Cole Publishing Company, Pacific Grove, CA, 4th edition, 1999.

[7] W. Lewin. Lec 3 | 8.01 Physics I: Classical Mechanics, Fall 1999 [online]. Available from: http://www.youtube.com/watch?v=fwNQKjTj-0w#t=13m45s [cited 16 March 2009].

[8] C.T.J. Dodson and T. Poston. Tensor Geometry: The Geometric Viewpoint and its Uses. Springer, 2nd edition, 1997.

¹ Of course, not all first semester physics students even know what a determinant is, but that is not my point.