Monday, November 30, 2009

Quaternions -- Part 1: How many?

[Click here for a PDF of this post with nicer formatting or see below]
The Setup
Quaternions seem to be one of the least understood mathematical things amongst physicists. I have sat in countless lectures where at some point the lecturer pointed out that a particular topic could be understood or explained using quaternions, but, when pressed, could not really explain what, precisely, one of these quaternion thingies actually is.

The first encounter people have with quaternions is generally after they learn about the complex plane and its relationship to the regular 2D Cartesian plane. After seeing all sorts of nifty properties and uses of this relationship (we'll see one shortly) it's only natural to ask if there's a 3D complex analogue to the 2D complex plane. And, therefore, most books ask precisely this question. However, they usually give less-than-satisfactory attempts at generalizing, highlighting the mysterious algebraic problem of ``closure'' or something to that effect.

Then, often retelling the story of Hamilton and a bridge1 they pull some strange, ``4D'' quaternions out of a hat and show how they happily resolve all the algebraic problems. This, it seems, should be enough to placate even the most thoughtful reader, and stands in place of an actual explanation. And even though there is a lot of information about these buggers out there on the intertubes, all that I've seen is of the same approach.

So, it doesn't surprise me, honestly, that ``quaternion'' is also one of the most popular searches on this blog. The topic of quaternions is really too big to handle fully in one post (and, for full disclosure, I do not completely understand them myself), so this post will deal primarily with a rationale for the initial guess of a ``4D'' quaternion.

This post assumes you have read and thoroughly grokked my discussion of dot and cross products,[1] and that you have a solid understanding of traditional complex numbers.


Complex Numbers and 2D Vectors
My approach in this section is based on the fantastic book, Visual Complex Analysis by Tristan Needham.[2] If something here isn't clear (and it's not the fault of my writing), or is different from the way you learned complex numbers, read this book. Even if everything is perfectly clear, read this book. What I'm trying to say is: Read this book.2

Recall that a complex number $z = x + iy$ can be represented by a vector in a 2D $xy$-plane. Also, if $r = |z|$ -- i.e. the ``modulus'' of $z$ or the length of the vector -- and $\theta$ is the ``argument'' of $z$ or the angle between the vector and the $x$-axis, we can also write $z = re^{i\theta}$ in ``polar form.'' See [2] for pictures. We also have Euler's identity

$$e^{i\theta} = \cos\theta + i\sin\theta \tag{1}$$

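Furthermore, recall that multiplying two complex numbers together effects a rotation and scaling. For example, multiplying a complex number $z_1 = r_1 e^{i\theta_1}$ -- graphically, a vector of length $r_1$ making angle $\theta_1$ with respect to the real axis ($x$-axis) -- by $z_2 = r_2 e^{i\theta_2}$ gives $z_1 z_2 = r_1 r_2 e^{i(\theta_1 + \theta_2)}$. This can be understood graphically as a scaling of $z_1$ by $r_2$ and a rotation of the direction of $z_1$ by the angle $\theta_2$. Finally, the complex conjugate of $z = x + iy = re^{i\theta}$ is given either by $\bar{z} = x - iy$ or $\bar{z} = re^{-i\theta}$.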


From Complex Multiplication to Vector Products
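For two complex numbers $z_1 = r_1 e^{i\theta_1}$ and $z_2 = r_2 e^{i\theta_2}$ let's see what $\bar{z}_1 z_2$ is. This demonstration (at least initially) is based on [2].3 Anyway,

$$\bar{z}_1 z_2 = r_1 e^{-i\theta_1}\, r_2 e^{i\theta_2} = r_1 r_2\, e^{i(\theta_2 - \theta_1)} \tag{2}$$

Graphically this is a vector with length $r_1 r_2$ at an angle $\theta_2 - \theta_1$ from the $x$-axis. Expanding this into a real and an imaginary part using Euler's identity (1) gives:

$$\bar{z}_1 z_2 = r_1 r_2 \cos(\theta_2 - \theta_1) + i\, r_1 r_2 \sin(\theta_2 - \theta_1) \tag{3}$$
We now note that the real part of this expression corresponds to the dot product between the two vectors $\vec{z}_1$ and $\vec{z}_2$. But what should we do with the imaginary part?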

Well, the magnitude of the imaginary part certainly corresponds to the magnitude of one dimension of the cross product between the two vectors. That is, if we relate the complex plane to the Cartesian $xy$-plane then the imaginary part of the product $\bar{z}_1 z_2$ is the $z$-component of the cross product $\vec{z}_1 \times \vec{z}_2$ of the two corresponding vectors. This important point is often lost in passing, and thus this property of complex multiplication is relegated to the realm of ``cool trick.'' However, we'll make good use of this detail.
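As a quick sanity check, here is a minimal Python sketch of the claim above. It assumes the product is taken as (conjugate of the first number) times (the second number); the helper name `dot_and_cross_z` and the sample lengths and angles are made up for illustration.

```python
import cmath
import math

def dot_and_cross_z(z1, z2):
    """Treat z1 and z2 as 2D vectors (real, imag) and return
    (dot product, z-component of the cross product)."""
    dot = z1.real * z2.real + z1.imag * z2.imag
    cross_z = z1.real * z2.imag - z1.imag * z2.real
    return dot, cross_z

# Two sample complex numbers in polar form (illustrative values).
z1 = cmath.rect(2.0, math.radians(30))  # r1 = 2, theta1 = 30 degrees
z2 = cmath.rect(3.0, math.radians(75))  # r2 = 3, theta2 = 75 degrees

product = z1.conjugate() * z2
dot, cross_z = dot_and_cross_z(z1, z2)

print("real part:", product.real, " dot product:", dot)
print("imag part:", product.imag, " cross product (z):", cross_z)
```

The real part should match $r_1 r_2 \cos(\theta_2 - \theta_1)$ and the imaginary part $r_1 r_2 \sin(\theta_2 - \theta_1)$, up to floating-point rounding.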


Rethinking complex numbers
Now we are ready for the conceptual jump. Although we got to the representation of dot and cross products through use of a 2D complex plane, we're going to distance ourselves from this wonderful visualization for the moment and note that an arbitrary complex number has two parts: One corresponds to a dot product, the other corresponds to one dimension of a cross product of two vectors.[3]4 If we want to find a relationship between complex numbers and 3D vectors we need to pick one of these parts to generalize.

Now, recall that the dot product yields a scalar quantity equal to the amount that two vectors point in the same direction. Since there is no directionality or dimensionality inherent in this quantity -- it's just a length -- there's really no way to add extra bits here. Length stays a scalar in any dimension.

So, instead we turn to the cross-product part. In the preceding section I repeatedly stressed that the imaginary part of the complex product corresponds to one dimension of a 3D cross product. However, which single dimension of the cross product we choose is completely arbitrary: Just as with the calculation of area for the cross product, the 2D Cartesian plane we choose to map to the complex plane could just as easily be the $xy$-, $yz$- or the $zx$-planes.

Recall that to resolve this ambiguity in cross-product land we chose to identify which plane we were talking about by a right-hand rule normal vector to the plane. However, here we're attempting to generalize complex numbers, not cross products per se. So, instead of assigning different normal vectors to each cross product term, let's assign a different complex number to each term. That is, $i^2 = -1$ and $j^2 = -1$, but $i \neq j$ for example. Then, we assign $i$ to the cross product of two vectors in the $xy$-plane and $j$ to the cross product of two vectors in a second coordinate plane (the $yz$-plane, say).

The one question remaining, though, as we generalize our complex plane, is how many additional complex numbers do we need? Maybe, naively, we can try adding just one extra cross-product dimension. That is, $i$ and $j$ only. The problem, though, can be seen in cross-product land.


Closure
Remember that a cross product resultant vector is a normal vector to an arbitrary plane in 3D Cartesian space, and thus always requires all three unit vectors $\hat{x}$, $\hat{y}$ and $\hat{z}$. For example, the cross product $\hat{x} \times \hat{y}$ is $\hat{z}$. That is, in order to make sense of $\hat{x}$ cross $\hat{y}$, which can exist in 2D, you must already have a third unit vector $\hat{z}$.

Physically, in Cartesian vector space, it means that you must be able to add any arbitrary 3D cross product resultant vectors and still get a 3D vector. In fact, if this wasn't true, there'd be no way to even write the 3D cross product in the first place since you need to project the arbitrary vectors to three (independent) 2D planes and then add the resulting normal vectors. You can't have just two cross-product parts and get a result that always makes sense. This is the requirement of ``closure.''

The reason there's no problem in the 2D plane version is simply because there's only one possible normal vector, so we only look at the magnitude of the cross product -- i.e. the amount of area -- and the sign. And that is just a scalar! In 2D land nothing is preventing you from adding the cross product to the dot product -- they're both scalars -- so you can write a two-element complex number combination with no trouble.

However, in 3D we can't simply add a vector to a scalar, and therefore we need all three parts of the cross product. So too, then, if we want a generalized complex number to have a dot-product part and a cross-product part that makes physical sense, we need three complex numbers: $i$ and $j$ from above, plus a $k$, corresponding to the cross product of the projection of vectors in the remaining coordinate plane.

Thus, we now have a generalized complex number -- quaternion -- of the form

$$q = a + bi + cj + dk \tag{4}$$


References
[1] E. Lansey. The dot product and cross products [online]. April 2009. Available from: http://behindtheguesses.blogspot.com/2009/04/dot-and-cross-products.html.
[2] Tristan Needham. Visual Complex Analysis. Oxford University Press, Oxford, UK, 1997.
[3] C. Doran and A.N. Lasenby. Geometric Algebra for Physicists. Cambridge University Press, Cambridge, UK, 1st edition, 2003.



1 Just Google it, it's not really worth retelling, in my opinion.
2 If I was stranded on an island forever but could bring only one math book, this would be it.
3 Just go out and get that book already! What are you waiting for?
4 Thanks, Peeter, for recommending the book ([3]) which highlighted this point.


Wednesday, September 30, 2009

Nothing new

Unless I miraculously complete 3 problem sets, find the errors in the calculations I've been working on for 4 months and the bugs in the code based on those calculations, and mow the lawn by a reasonable hour today, there will likely not be a new Behind the Guesses post of substance this month.

However, I am considering a new method of posting, using Google Document Viewer, rather than converting LaTeX into pictures, etc. This is the last post, Noncommuting Rotation and Angular Momentum Operators, using the new method. Would you prefer this, or the old way? Or do you just download the PDF, and it makes no difference?


Monday, August 31, 2009

Noncommuting Rotation and Angular Momentum Operators

[Click here for a PDF of this post with nicer formatting]
The Setup
Avi Ziskind1 asked me to cover non-commuting operators in quantum mechanics, specifically why angular momentum operators do not commute. He pointed out that Griffiths [1] gives an intuitive argument for understanding why position and momentum operators do not commute but does not present any rationale for why the different components of angular momentum have the commutation relation

$$[L_i, L_j] = i\hbar\,\epsilon_{ijk} L_k \tag{1}$$
Additionally, Schwabl [2], for example, defines the angular momentum operator, presents the commutation relations, and at least attempts (I think) to show (in a post-facto way) why they should have such relations. Likewise, in a related (as we'll see) problem, Goldstein et al. [3] discuss the commutation relations of generators of rotation without any physical argument.

However, both Sakurai [4] and Landau and Lifshitz [5], to some degree, present physical rationales for these relations. Landau and Lifshitz derive the notion of angular momentum in quantum theory quite nicely, and succinctly, but do not argue for why the commutation relations should hold. Sakurai develops a set of commutation relations independently of QM (as I will, shortly), but, I feel, bridges the gap to angular momentum rather poorly.

This post assumes familiarity with the ``generator of transformation'' ideas in [6].


The Generator of Rotation
Figure 1: The rotation of a vector around the $z$-axis, shown (a) in 3D and (b) as the projection of (a) onto the $xy$-plane.

In a previous post I covered the notion of ``generators of transformations,''[6] and claimed, as an example, that the ``generator of rotation'' is the angular momentum. Actually, I was getting ahead of myself there, and the statement in that context was not entirely correct. As I did not derive this result in that post, I will now, and will hopefully clear things up.

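Suppose we have a function $f(x, y, z)$ and we want to rotate it in space around the $z$ axis through some angle $\theta$ to $f(x', y', z')$. To do this, we'll find an ``angle rotation'' operator $R_z(\theta)$, which, when applied to $f(x, y, z)$, gives $f(x', y', z')$. That is,

$$R_z(\theta)\, f(x, y, z) = f(x\cos\theta - y\sin\theta,\; x\sin\theta + y\cos\theta,\; z) \tag{2}$$
The shift in coordinates can be derived from regular vector analysis, see Fig. 1 and Ref. [7], applied inside the arguments of the function.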

Now the tricky part -- the Taylor expansion. Unlike the last time where the translated function had a simple argument, here we have $\theta$ inside sines and cosines. Since I'm really too lazy to do this expansion by hand I had Mathematica do it for me:

$$\begin{aligned} f(x\cos\theta - y\sin\theta,\; x\sin\theta + y\cos\theta,\; z) ={}& f + \theta\left(x f^{(0,1,0)} - y f^{(1,0,0)}\right) \\ &+ \frac{\theta^2}{2}\left(-x f^{(1,0,0)} - y f^{(0,1,0)} + y^2 f^{(2,0,0)} - 2xy f^{(1,1,0)} + x^2 f^{(0,2,0)}\right) + \dots \end{aligned} \tag{3}$$
In Mathematica's notation, $f$ raised to those parenthetical powers denotes partial derivatives. Say, $f^{(1,0,0)}$ means $\partial f/\partial x$, for example. This expression is a bit of a mess, but we are not completely lost. From our discussion in the beginning of [6], we know that at least one similar operator takes an exponential form. So, we'll guess that here, as well, our operator will take an exponential form. We just need to process the mess of (3) to find that hidden exponential.

The first two terms in the series give us hope. They can be written as

$$\left[1 + \theta\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)\right] f \tag{4}$$
which are, indeed, what we would expect to see at the beginning of an exponential expansion, where $x\,\partial/\partial y - y\,\partial/\partial x$ is the generator. Now we check that this keeps up for higher powers.

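Continuing with the quadratic term, let's see if we can write it as $\frac{\theta^2}{2}\left(x\,\partial/\partial y - y\,\partial/\partial x\right)^2 f$ which would be the next term in an exponential. We check:

$$\begin{aligned} \frac{\theta^2}{2}\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)^2 f &= \frac{\theta^2}{2}\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)\left(x\frac{\partial f}{\partial y} - y\frac{\partial f}{\partial x}\right) \\ &= \frac{\theta^2}{2}\left(x^2\frac{\partial^2 f}{\partial y^2} - 2xy\frac{\partial^2 f}{\partial x\,\partial y} + y^2\frac{\partial^2 f}{\partial x^2} - x\frac{\partial f}{\partial x} - y\frac{\partial f}{\partial y}\right) \end{aligned} \tag{5}$$
which does match the mess for the $\theta^2$ term in the expansion (3). You can verify on your own that this pattern continues in the higher powers.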

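Thus we conclude that

$$R_z(\theta) = e^{\theta G_z} \tag{6}$$
where we now identify

$$G_z = x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x} \tag{7}$$
as the generator of the rotation.

This generator is the $z$-component of the cross product

$$\mathbf{r} \times \nabla$$
where $\mathbf{r} = (x, y, z)$ and $\nabla = (\partial/\partial x, \partial/\partial y, \partial/\partial z)$. Thus, we can simplify

$$G_z = \hat{\mathbf{z}} \cdot (\mathbf{r} \times \nabla) \tag{8}$$
If we carry through these same calculations for rotations around the $x$ or $y$ axes (try it yourself!) we get similar generators

$$G_x = y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y} = \hat{\mathbf{x}} \cdot (\mathbf{r} \times \nabla) \tag{9}$$

$$G_y = z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z} = \hat{\mathbf{y}} \cdot (\mathbf{r} \times \nabla) \tag{10}$$
This allows us to write the rotation operator for a rotation around an arbitrary axis $\hat{\mathbf{n}}$, as

$$R_{\hat{n}}(\theta) = e^{\theta\, \hat{\mathbf{n}} \cdot \mathbf{G}} \tag{11}$$
where

$$\mathbf{G} = \mathbf{r} \times \nabla \tag{12}$$
is the generator of the transformation.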
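The exponential guess can be spot-checked numerically. The sketch below is only an illustration: it assumes the rotation acts on the arguments as $(x, y) \to (x\cos\theta - y\sin\theta,\; x\sin\theta + y\cos\theta)$ and that the generator is $x\,\partial/\partial y - y\,\partial/\partial x$, approximated here with central differences. The test function `f` and the helper `apply_generator` are hypothetical names chosen for the example.

```python
import math

def f(x, y):
    # Hypothetical smooth test function; any differentiable f would do.
    return x**2 * y + 3.0 * x

def apply_generator(g, h=1e-4):
    """Return the function (x d/dy - y d/dx) g, using central differences."""
    def generated(x, y):
        dg_dx = (g(x + h, y) - g(x - h, y)) / (2 * h)
        dg_dy = (g(x, y + h) - g(x, y - h)) / (2 * h)
        return x * dg_dy - y * dg_dx
    return generated

x0, y0, theta = 2.0, 1.0, 0.01

# Rotate the arguments of f directly.
exact = f(x0 * math.cos(theta) - y0 * math.sin(theta),
          x0 * math.sin(theta) + y0 * math.cos(theta))

# Build the first terms of the exponential series acting on f.
Gf = apply_generator(f)
GGf = apply_generator(Gf)
first_order = f(x0, y0) + theta * Gf(x0, y0)
second_order = first_order + 0.5 * theta**2 * GGf(x0, y0)

err1 = abs(exact - first_order)
err2 = abs(exact - second_order)
print("error keeping terms up to theta:  ", err1)
print("error keeping terms up to theta^2:", err2)
```

Each extra term of the series should shrink the error by roughly another factor of $\theta$, which is what an exponential of the generator predicts.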


Commutators in general
In general, rotations do not commute. That is, rotating an object first around the $x$-axis and then around the $y$-axis will give a different result than rotating in the opposite order. You can convince yourself of this by the ultimate hand-waving argument2 -- twist your hand around different axes in different orders. Or see Fig. 2.

We'd like to find a way to quantify the difference between applying the rotations in different order, but, for the sake of generality, we'll discuss this for any two arbitrary operators $\hat{A}$ and $\hat{B}$. The most natural way to quantify a difference is to look at, well, the difference. That is, if these operators act on a vector $\vec{v}$, we'd like to know what

$$\hat{A}\hat{B}\vec{v} - \hat{B}\hat{A}\vec{v} \tag{13}$$
is. This difference (for linear operators) does not depend on the particular vector $\vec{v}$, so we'll define the commutator of two operators as

$$[\hat{A}, \hat{B}] = \hat{A}\hat{B} - \hat{B}\hat{A} \tag{14}$$
Thus, a commutator of two operators is another operator which enacts this difference. If the order of operator application does not affect the end result the commutator is 0, and the operators are said to ``commute.''
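To make the definition concrete, here is a small Python sketch using ordinary 3x3 rotation matrices acting on vectors as the two operators. The helper names (`matmul`, `commutator`, `rot_x`, `rot_y`, `max_abs`) are my own and purely illustrative.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def max_abs(m):
    """Largest entry of a matrix in absolute value."""
    return max(abs(entry) for row in m for entry in row)

quarter_turn = math.pi / 2
different_axes = max_abs(commutator(rot_x(quarter_turn), rot_y(quarter_turn)))
same_axis = max_abs(commutator(rot_x(0.3), rot_x(1.1)))

print("rotations about x and y:", different_axes)  # clearly nonzero
print("two rotations about x:  ", same_axis)       # zero up to rounding
```

Rotations about different axes give a nonzero commutator, while two rotations about the same axis commute.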

In quantum mechanics, the issue of non-commuting operators is closely tied to the problem of measurement and the uncertainty principle. For example, if I have a state $|\psi\rangle$ and I want to measure the position I apply the position operator $\hat{x}$. Likewise, if I want to measure the momentum I apply the momentum operator $\hat{p}$. However, in quantum mechanics, the order of taking these measurements affects the results, such that $\hat{x}\hat{p}|\psi\rangle \neq \hat{p}\hat{x}|\psi\rangle$, for example. However, the applicability of commutators is not relegated only to quantum mechanics.


Commutators for rotation
This brings us back to our original question of the commutator of rotations. Because any two rotations through arbitrary angles, done in opposite orders, give drastically different results depending on the angles, we'll consider rotations through small angles $\delta\theta$, such that we can approximate (11) by the first two terms in the expansion:

$$R_{\hat{n}}(\delta\theta) \approx 1 + \delta\theta\; \hat{\mathbf{n}} \cdot \mathbf{G} \tag{15}$$
This simpler expression makes calculating the commutator much simpler. For rotations around $\hat{\mathbf{n}}_1$ and $\hat{\mathbf{n}}_2$, the commutator depends only on the commutator of the generators $[G_1, G_2]$, where $G_1 = \hat{\mathbf{n}}_1 \cdot \mathbf{G}$ and $G_2 = \hat{\mathbf{n}}_2 \cdot \mathbf{G}$.3 This commutator is the generator of the transformation for ``the difference between the order of the rotations.'' That is

$$[R_1(\delta\theta_1), R_2(\delta\theta_2)] \approx \delta\theta_1\, \delta\theta_2\, [G_1, G_2] \tag{16}$$
where $\delta\theta_1\, \delta\theta_2$ is the parameter for this transformation. Then, just as any rotation can be built up from repeated applications of the generator (as in that exponential), the commutators for larger angles can be built up from repeated applications of the commutators of the generators.

Figure 3: Graphical commutator of $[G_x, G_y]$. Blue vector is application of either $G_x$ or $G_y$. Red is further application of $G_y$ to $G_x$ and green is further application of $G_x$ to $G_y$. Brown is difference between the two.

For ease of illustration, we'll consider small rotations around the $x$- and $y$-axes (i.e. $G_x$ and $G_y$). There are two ways to find the commutator $[G_x, G_y]$. One way is by brute force calculation which I encourage you to try on your own (use the expressions (9) and (10)). However, I prefer showing it graphically, see Fig. 3. Starting with a vector in the $xy$-plane, we apply a small rotation around $\hat{\mathbf{x}}$. This directs the vector upwards (blue in the picture). Then we apply another small rotation around $\hat{\mathbf{y}}$, which directs the vector along the red line.

If we start with the same vector, and apply a small rotation around $\hat{\mathbf{y}}$, the vector follows the blue line again. However, when we then rotate around $\hat{\mathbf{x}}$, the vector veers off in the opposite direction at the same rate. The difference between the red and green vectors, as well as that difference added to the initial vector, is shown in brown. The picture illustrates that

$$[G_x, G_y] = -G_z \tag{17}$$
i.e. the generator of rotation around the $z$-axis, up to a sign. (The minus sign is what the brute force calculation with the differential forms (9) and (10) gives.) Similar relationships

$$[G_y, G_z] = -G_x, \qquad [G_z, G_x] = -G_y \tag{18}$$
hold for other permutations of $x$, $y$, $z$.
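The small-angle argument can also be checked with plain numbers. This sketch uses the 3x3 matrices that rotate vectors (not the differential operators acting on functions), and compares the commutator of two small rotations with $\epsilon^2$ times the matrix generator of rotations about $z$. All names here are illustrative assumptions. One caveat on signs: for matrices acting on vectors the result comes out as $+J_z$, whereas for operators acting on the arguments of a function the composition order is reversed, which flips the overall sign.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

# Matrix generator of rotations about z: the derivative of rot_z(t) at t = 0.
J_z = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

eps = 1e-3
scaled = [[entry / eps**2 for entry in row]
          for row in commutator(rot_x(eps), rot_y(eps))]

deviation = max(abs(scaled[i][j] - J_z[i][j])
                for i in range(3) for j in range(3))
print("largest deviation from J_z:", deviation)
```

The deviation is of order $\epsilon$, so it shrinks as the rotation angle is made smaller.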


Angular momentum
Looking back at the expression for the generator of rotations (12), we see that we can re-write this in terms of the momentum operator

$$\hat{\mathbf{p}} = -i\hbar\nabla \tag{19}$$
in quantum mechanics:

$$\hat{\mathbf{L}} = \mathbf{r} \times \hat{\mathbf{p}} = -i\hbar\, \mathbf{r} \times \nabla = -i\hbar\, \mathbf{G} \tag{20}$$
where we call $\hat{\mathbf{L}}$ the ``quantum mechanical angular momentum'' operator.4 Flipping this around to solve for $\mathbf{G}$ in terms of $\hat{\mathbf{L}}$:

$$\mathbf{G} = \frac{i}{\hbar}\hat{\mathbf{L}} \tag{21}$$
In other words, the quantum mechanical angular momentum is the same (up to a constant) as the generator of rotations. Thus, the reason that quantum angular momentum has commutation relations (1) is due to the fact that it's simply a generator of rotation masquerading as a quantum mechanical operator.


References
[1] D.J. Griffiths. Introduction to Electrodynamics. Pearson Prentice Hall, 3rd edition, 1999.
[2] F. Schwabl. Quantum Mechanics. Springer, 3rd edition, 2005.
[3] H. Goldstein, C. Poole, and J. Safko. Classical Mechanics. Addison-Wesley, San Francisco, CA, 3rd edition, 2002.
[4] J.J. Sakurai. Modern Quantum Mechanics. Addison-Wesley, San Francisco, CA, revised edition, 1993.
[5] L.D. Landau and E.M. Lifshitz. Quantum Mechanics. Butterworth-Heinemann, Oxford, UK, 3rd edition, 1977.
[6] E. Lansey. The Schrodinger Equation -- Corrections [online]. June 2009. Available from: http://behindtheguesses.blogspot.com/2009/06/schrodinger-equation-corrections.html.
[7] D.C. Lay. Linear Algebra and Its Applications. Addison-Wesley, Reading, MA, 3rd edition, 2003.
[8] C.T.J. Dodson and T. Poston. Tensor Geometry: The Geometric Viewpoint and its Uses. Springer, 2nd edition, 1997.



1 Everyone congratulate him on the birth of a son!
2 Borrowing a joke from Dodson and Poston, [8]
3 If this isn't obvious, work it out for yourself. Hint: The identity operator 1 commutes with everything.
4 There are better arguments (see [5]) using symmetry for why $\hat{\mathbf{L}}$ should actually be the angular momentum, not just called it, as I've argued, but they require much more talking. And this post is long enough already.

Wednesday, July 29, 2009

Transverse Electric and Magnetic Fields in a Waveguide

[Click here for a PDF of this post with nicer formatting]
The Setup

Figure 1: An example of a section of a cylindrical waveguide with embedded coordinate axes.

A conducting waveguide is a metal tube -- think pipe or air conditioning duct, for example -- through which electromagnetic waves can propagate. If you want to know what real-life waveguides look like, just do a quick internet image search. We'll assume the length of the tube is oriented along the $z$-direction, see Fig. 1. There is no loss of generality in doing this, since we can always choose a coordinate system as we like. So really, we're picking a coordinate system such that the $z$-axis points along the tube.

Now, we can decompose the electric field $\mathbf{E}$ and magnetic (induction) field $\mathbf{B}$ vectors into two parts each. One part points along the $z$ (normal) direction while the other is pointing somewhere in the $xy$ (transverse) plane. Explicitly:

$$\mathbf{E} = E_z \hat{\mathbf{z}} + \mathbf{E}_t \tag{1a}$$

$$\mathbf{B} = B_z \hat{\mathbf{z}} + \mathbf{B}_t \tag{1b}$$

In the first ([1], Eq. (8.24)) and third ([2], Eq. (8.26)) editions of Classical Electrodynamics, J.D. Jackson gives the transverse fields in terms of the $z$-components of the fields. (I have no idea why he left the complete expression out of the second edition.) In the third edition, for example, he assumes plane wave propagation in the positive $z$ direction -- that is, an $e^{ikz}$ dependence -- and simply states, without any real explanation:
the transverse fields are

$$\mathbf{E}_t = \frac{i}{\mu\epsilon\frac{\omega^2}{c^2} - k^2}\left[k\nabla_t E_z - \frac{\omega}{c}\,\hat{\mathbf{z}} \times \nabla_t B_z\right], \qquad \mathbf{B}_t = \frac{i}{\mu\epsilon\frac{\omega^2}{c^2} - k^2}\left[k\nabla_t B_z + \mu\epsilon\frac{\omega}{c}\,\hat{\mathbf{z}} \times \nabla_t E_z\right]$$
where I've converted his new choice of MKSA units back into the clearer CGS units. However, back in the first edition he does not insist on the assumption of positive $z$ propagation. Moreover, he does not just state the fields; he suggests a method for getting them -- namely, manipulation of the curl equations in Maxwell's equations. However, in that edition, he does not expand the curl equations in light of the separation of the fields into transverse and parallel components as he does in the second and third editions.

Because of all this confusion, I'm going to derive the cavity modes fully, starting from Maxwell's equations, once and for all. This derivation is based on a combination of all three editions of Jackson's book. This is a tedious, although not completely trivial exercise. Brace yourselves for quite a bit of algebra.


Maxwell's Equations - The Curls
Here we'll deal with the two curl equations in Maxwell's equations:

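$$\nabla \times \mathbf{E} = -\frac{1}{c}\frac{\partial \mathbf{B}}{\partial t} \tag{2a}$$

$$\nabla \times \mathbf{H} = \frac{4\pi}{c}\mathbf{J} + \frac{1}{c}\frac{\partial \mathbf{D}}{\partial t} \tag{2b}$$
where $\mathbf{H}$ is the magnetic field and $\mathbf{D}$ is the electric displacement field. We will assume the inside of the waveguide has uniform permittivity and permeability, so $\mathbf{D} = \epsilon\mathbf{E}$ and $\mathbf{B} = \mu\mathbf{H}$. Also, we'll assume the absence of any currents, so $\mathbf{J} = 0$ and we'll drop it from here on. Additionally, we'll assume the same sinusoidal time dependence, $e^{-i\omega t}$, for both the fields. Thus, the time derivatives ``bring down'' a factor of $-i\omega$.

Furthermore, since we're splitting up $\mathbf{E}$ and $\mathbf{B}$ into normal and transverse parts, we'll do the same with the gradient operator $\nabla$:

$$\nabla = \nabla_t + \hat{\mathbf{z}}\frac{\partial}{\partial z}$$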

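Because curl equations are annoying, and because we're ultimately looking for an equation for the transverse fields, I'm going to try and get rid of the $\nabla\times$'s. The symmetry of form in (2) means that we'll only need to do these calculations once; I will use $\mathbf{A}$ in place of either $\mathbf{E}$ or $\mathbf{B}$.
First, we'll expand $\nabla \times \mathbf{A}$:

$$\nabla \times \mathbf{A} = \left(\nabla_t + \hat{\mathbf{z}}\frac{\partial}{\partial z}\right) \times \left(\mathbf{A}_t + \hat{\mathbf{z}} A_z\right) = \nabla_t \times \mathbf{A}_t + \nabla_t \times \hat{\mathbf{z}} A_z + \hat{\mathbf{z}} \times \frac{\partial \mathbf{A}_t}{\partial z} \tag{3}$$
We've killed one term through this expansion, since $\hat{\mathbf{z}} \times \hat{\mathbf{z}} = 0$. However, the leftmost cross product term gives a quantity with only a $z$ component. The right-hand sides of these equations also have a $z$ term. We can get rid of both by multiplying the entire equation(s) by $\hat{\mathbf{z}}\times$:

$$\hat{\mathbf{z}} \times (\nabla \times \mathbf{A}) = \hat{\mathbf{z}} \times \left(\nabla_t \times \hat{\mathbf{z}} A_z\right) + \hat{\mathbf{z}} \times \left(\hat{\mathbf{z}} \times \frac{\partial \mathbf{A}_t}{\partial z}\right) = \hat{\mathbf{z}} \times \left(\nabla_t \times \hat{\mathbf{z}} A_z\right) - \frac{\partial \mathbf{A}_t}{\partial z} \tag{4}$$

Figure 2: Vectors $\hat{\mathbf{z}}$, $\mathbf{A}_t$ and $\hat{\mathbf{z}} \times \mathbf{A}_t$.

For why $\hat{\mathbf{z}} \times (\hat{\mathbf{z}} \times \partial\mathbf{A}_t/\partial z) = -\partial\mathbf{A}_t/\partial z$ see Fig. 2. Also, we note that

$$\hat{\mathbf{z}} \times \left(\nabla_t \times \hat{\mathbf{z}} A_z\right) = \nabla_t A_z \tag{5}$$
for the same reason. We could have used the vector multiplication identity

$$\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = \mathbf{b}(\mathbf{a} \cdot \mathbf{c}) - \mathbf{c}(\mathbf{a} \cdot \mathbf{b})$$
to simplify both of these expressions, or expanded $\nabla_t$ and $\mathbf{A}_t$ and carried through even more algebra, but I think the picture is clearer.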
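If the picture doesn't convince you, the two double cross products are easy to check with numbers. In this sketch the transverse vectors are made-up sample values, and the transverse gradient of $A_z$ is treated as just another transverse vector `grad_t`, so that the curl of $\hat{\mathbf{z}} A_z$ becomes `cross(grad_t, z_hat)`. That substitution is an assumption of the example, valid because $\hat{\mathbf{z}}$ is constant.

```python
def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

z_hat = (0.0, 0.0, 1.0)

# Any vector lying in the transverse (xy) plane.
a_t = (1.5, -2.0, 0.0)
double_cross = cross(z_hat, cross(z_hat, a_t))
print(double_cross)  # should be -a_t

# Stand-in for the transverse gradient of A_z.
grad_t = (0.7, 0.4, 0.0)
recovered = cross(z_hat, cross(grad_t, z_hat))
print(recovered)     # should be grad_t itself
```

Crossing twice with $\hat{\mathbf{z}}$ rotates a transverse vector by 90 degrees twice, which is where the minus sign comes from.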

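Thus,

$$\hat{\mathbf{z}} \times (\nabla \times \mathbf{A}) = \nabla_t A_z - \frac{\partial \mathbf{A}_t}{\partial z} \tag{6}$$
and we can write (2) as

$$\nabla_t E_z - \frac{\partial \mathbf{E}_t}{\partial z} = \frac{i\omega}{c}\,\hat{\mathbf{z}} \times \mathbf{B}_t \tag{7a}$$

$$\nabla_t B_z - \frac{\partial \mathbf{B}_t}{\partial z} = -\frac{i\mu\epsilon\omega}{c}\,\hat{\mathbf{z}} \times \mathbf{E}_t \tag{7b}$$

At this point, it's time to introduce the explicit $z$ dependence and process the $\partial/\partial z$ derivatives.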


Some $\pm$ and $\mp$ notes
Unlike Jackson, who works with the assumption of upward propagating waves -- i.e. an $e^{ikz}$ dependence -- we'll work with an assumed $e^{\pm ikz}$ dependence, thus allowing both upward and downward propagating waves. Thus, the $\partial/\partial z$ derivatives ``bring down'' a factor of $\pm ik$. Whenever we have $\pm$ or $\mp$ the upper symbol is the sign for upward propagating waves, the lower symbol is for downward propagating. Because we'll be mucking about with these plus-minus guys in some algebra, I want to get a few issues out of the way.

The first thing to keep in mind about these plus-minus operators is that an equation like

$$a = \pm b \mp c \tag{8}$$
is shorthand for two different equations:

$$a = +b - c \tag{9a}$$

$$a = -b + c \tag{9b}$$

So, there are essentially two ways to approach these things. One way is to carefully trace at the outset what happens to $\pm$ or $\mp$ under various arithmetic operations like addition, multiplication, etc. This has the benefit of being more concise -- you only need to write each equation once -- but it makes errors a lot easier and hides the double-equation nature of the symbol. I'll admit, though, that when I'm writing a paper I'm generally inclined to take this path.

However, for the purposes of this blog post, I'll explicitly carry out the calculations in parallel equations. (This really looks much better in the PDF. If anyone has any suggestions for improving the web version, please, let me know!) The left-hand column corresponds to $e^{+ikz}$, the right-hand column to $e^{-ikz}$. At the end I will also show what the results look like in the shorthand notation and I encourage you to work out the rules on your own. Perhaps in another post I'll address the shorthand notation in detail.


Some more algebra
Now, it's time for some more algebra.1 Taking the derivative in (7) gives:








|


(10)
and








|


(11)

Solving (10) for gives








|


(12)
Substituting this into (11) and simplifying:


















|



|



|


(13)

Solving this for gives:








|


(14)
Or, in form:

(15)

In the first edition, Jackson converts the back into to get rid of the , but I feel this confuses things, as this expression only holds for a plane wave in the direction. In any case, we now substitute this expression for back into (12) and simplify:























|



|



|



|


(16)
Or, in form:

(17)

So, we've finally achieved Jackson's result, allowing for both upward and downward propagating waves.
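After that much algebra a numerical spot check is reassuring. The sketch below assumes the CGS forms used in this post: the transverse curl equations $\nabla_t E_z \mp ik\mathbf{E}_t = (i\omega/c)\,\hat{\mathbf{z}} \times \mathbf{B}_t$ and $\nabla_t B_z \mp ik\mathbf{B}_t = -(i\mu\epsilon\omega/c)\,\hat{\mathbf{z}} \times \mathbf{E}_t$, with the transverse fields given by $\mathbf{E}_t = (i/\gamma^2)[\pm k\nabla_t E_z - (\omega/c)\,\hat{\mathbf{z}} \times \nabla_t B_z]$ and $\mathbf{B}_t = (i/\gamma^2)[\pm k\nabla_t B_z + (\mu\epsilon\omega/c)\,\hat{\mathbf{z}} \times \nabla_t E_z]$, where $\gamma^2 = \mu\epsilon\omega^2/c^2 - k^2$. The transverse gradients are replaced by arbitrary complex 2D vectors, and every numerical value and function name is made up for the test.

```python
def z_cross(v):
    """z-hat crossed with a transverse vector (vx, vy)."""
    return (-v[1], v[0])

def max_residual(sign, k, omega, c, mu, eps, grad_Ez, grad_Bz):
    """Plug the closed-form transverse fields into both curl equations
    and return the largest leftover; sign = +1 or -1 for e^{+ikz} or e^{-ikz}."""
    gamma2 = mu * eps * omega**2 / c**2 - k**2
    z_x_gE = z_cross(grad_Ez)
    z_x_gB = z_cross(grad_Bz)

    # Closed-form transverse fields.
    E_t = tuple(1j / gamma2 * (sign * k * gE - omega / c * zb)
                for gE, zb in zip(grad_Ez, z_x_gB))
    B_t = tuple(1j / gamma2 * (sign * k * gB + mu * eps * omega / c * ze)
                for gB, ze in zip(grad_Bz, z_x_gE))

    # Residuals of the two curl equations after d/dz -> sign * i * k.
    z_x_E = z_cross(E_t)
    z_x_B = z_cross(B_t)
    res_a = [gE - sign * 1j * k * e - 1j * omega / c * zb
             for gE, e, zb in zip(grad_Ez, E_t, z_x_B)]
    res_b = [gB - sign * 1j * k * b + 1j * mu * eps * omega / c * ze
             for gB, b, ze in zip(grad_Bz, B_t, z_x_E)]
    return max(abs(r) for r in res_a + res_b)

# Arbitrary sample values (not from any physical waveguide).
params = dict(k=1.3, omega=2.0, c=1.0, mu=1.1, eps=2.5,
              grad_Ez=(0.3 + 0.1j, -0.7 + 0.2j),
              grad_Bz=(1.2 - 0.4j, 0.5 + 0.9j))

upward = max_residual(+1, **params)
downward = max_residual(-1, **params)
print("residual, upward wave:  ", upward)
print("residual, downward wave:", downward)
```

Both residuals should vanish to within rounding error, for either sign of the propagation direction.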


References
[1] J.D. Jackson. Classical Electrodynamics. John Wiley & Sons, Inc., 1st edition, 1966.
[2] J.D. Jackson. Classical Electrodynamics. John Wiley & Sons, Inc., 3rd edition, 1998.




1 In case you were wondering why Jackson left out the whole calculation...

Tuesday, June 30, 2009

Derivative and Integral of the Heaviside Step Function

[Click here for a PDF of this post with nicer formatting]
The Setup
Figure 1: The Heaviside step function at (a) a large horizontal scale and (b) ``zoomed in.'' Note how it doesn't matter how close we get to $x = 0$; the function looks exactly the same.

The Heaviside step function $\theta(x)$, sometimes called the Heaviside theta function, appears in many places in physics, see [1] for a brief discussion. Simply put, it is a function whose value is zero for $x < 0$ and one for $x > 0$. Explicitly,

$$\theta(x) = \begin{cases} 0 & x < 0 \\ 1 & x > 0 \end{cases} \tag{1}$$
We won't worry about precisely what its value is at zero for now, since it won't affect our discussion, see [2] for a lengthier discussion. Fig. 1 plots $\theta(x)$. The key point is that crossing zero flips the function from 0 to 1.


Derivative -- The Dirac Delta Function
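Figure 2: The derivative (a), a Dirac delta function, and the integral (b), a ramp function, of the Heaviside step function.
Say we wanted to take the derivative of $\theta(x)$. Recall that a derivative is the slope of the curve at a point. One way of formulating this is

$$\frac{d\theta}{dx} = \lim_{\Delta x \to 0} \frac{\Delta\theta}{\Delta x} \tag{2}$$
Now, for any points $x < 0$ or $x > 0$, graphically, the derivative is very clear: $\theta(x)$ is a flat line in those regions, and the slope of a flat line is zero. In terms of (2), $\theta$ does not change, so $\Delta\theta = 0$ and $d\theta/dx = 0$. But if we pick two points, equally spaced on opposite sides of $x = 0$, say $x_- = -\Delta x/2$ and $x_+ = +\Delta x/2$, then $\theta(x_-) = 0$ and $\theta(x_+) = 1$. It doesn't matter how small we make $\Delta x$, $\Delta\theta$ stays the same. Thus, the fraction in (2) is

$$\lim_{\Delta x \to 0} \frac{\Delta\theta}{\Delta x} = \lim_{\Delta x \to 0} \frac{1}{\Delta x} = \infty \tag{3}$$
Graphically, again, this is very clear: $\theta(x)$ jumps from 0 to 1 at zero, so its slope is essentially vertical, i.e. infinite. So basically, we have

$$\frac{d\theta}{dx} = \begin{cases} 0 & x \neq 0 \\ \infty & x = 0 \end{cases} \tag{4}$$
This function is, loosely speaking, a ``Dirac Delta'' function, usually written as $\delta(x)$, which has seemingly endless uses in physics.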

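We'll note a few properties of the delta function that we can derive from (4). First, integrating it from $-\infty$ to any $x < 0$:

$$\int_{-\infty}^{x < 0} \delta(x')\, dx' = \int_{-\infty}^{x < 0} \frac{d\theta}{dx'}\, dx' = \theta(x) = 0 \tag{5}$$
since $\theta(x < 0) = 0$. On the other hand, integrating the delta function to any point greater than $x = 0$:

$$\int_{-\infty}^{x > 0} \delta(x')\, dx' = \int_{-\infty}^{x > 0} \frac{d\theta}{dx'}\, dx' = \theta(x) = 1 \tag{6}$$
since $\theta(x > 0) = 1$.

At this point, I should point out that although the delta function blows up to infinity at $x = 0$, it still has a finite integral. An easy way of seeing how this is possible is shown in Fig. 2(a). If the width of the box is $1/a$ and the height is $a$, the area of the box (i.e. its integral) is $1$, no matter how large $a$ is. By letting $a$ go to infinity we have a box with infinite height, yet, when integrated, it has finite area.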
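Here is the box argument as a short Python sketch. The box is taken to be centred on $x = 0$ with height $a$ and width $1/a$, and the area is estimated with a simple midpoint sum; the function names and the grid size are arbitrary choices for the illustration.

```python
def box(x, a):
    """Box of height a and width 1/a centred on x = 0."""
    return a if abs(x) < 0.5 / a else 0.0

def integrate(func, lower, upper, n=200000):
    """Midpoint-rule estimate of the integral of func from lower to upper."""
    dx = (upper - lower) / n
    return sum(func(lower + (i + 0.5) * dx) for i in range(n)) * dx

areas = {}
for a in (1, 10, 100):
    areas[a] = integrate(lambda x: box(x, a), -1.0, 1.0)
    print("height", a, "-> area", areas[a])
```

The height grows by a factor of 100 across the three cases while the area stays put, which is exactly the behaviour the delta function inherits in the limit.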


Integral -- The Ramp Function
Now that we know about the derivative, it's time to evaluate the integral. I have two methods of doing this. The most straightforward way, which I first saw from Prof. T.H. Boyer, is to integrate $\theta(x)$ piece by piece. The integral of a function is the area under the curve,1 and when $x < 0$ there is no area, so the integral from $-\infty$ to any point less than zero is zero. On the right side, the integral to a point $x$ is the area of a rectangle of height 1 and length $x$, see Fig. 1(a). So, we have

$$\int_{-\infty}^{x} \theta(x')\, dx' = \begin{cases} 0 & x < 0 \\ x & x > 0 \end{cases} \tag{7}$$
We'll call this function a ``ramp function,'' $R(x)$. We can actually make use of the definition of $\theta(x)$ and simplify the notation:

$$R(x) = \int_{-\infty}^{x} \theta(x')\, dx' = x\,\theta(x) \tag{8}$$
since $x\,\theta(x) = 0$ for $x < 0$ and $x\,\theta(x) = x$ for $x > 0$. See Fig. 2(b) for a graph -- and the reason for calling this a ``ramp'' function.
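The piece-by-piece integration can be mimicked numerically. In this sketch the lower limit $-\infty$ is replaced by an arbitrary cutoff of $-5$ (harmless, since the step function is zero there), and the function names are my own.

```python
def heaviside(x):
    """Step function; the value exactly at zero is a convention and is set to 0 here."""
    return 1.0 if x > 0 else 0.0

def ramp(x):
    """Closed form of the integral: x times the step function."""
    return x * heaviside(x)

def integral_of_step(x, lower=-5.0, n=100000):
    """Midpoint-rule estimate of the integral of the step function from lower to x."""
    dx = (x - lower) / n
    return sum(heaviside(lower + (i + 0.5) * dx) for i in range(n)) * dx

checks = []
for x in (-2.0, 0.5, 3.0):
    numeric = integral_of_step(x)
    checks.append((x, numeric, ramp(x)))
    print("x =", x, " numeric:", numeric, " x*theta(x):", ramp(x))
```

For negative $x$ both columns are zero, and for positive $x$ both grow linearly, which is the ramp.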

But I have another way of doing this which makes use of a trick that's often used by physicists: We can always add zero for free, since $a + 0 = a$. Often we do this by adding and subtracting the same thing,

$$a = a + b - b \tag{9}$$
for example. But we can use the delta function (4) to add zero in the form

$$0 = x\,\delta(x) \tag{10}$$
Since $\delta(x)$ is zero for $x \neq 0$, the $x$ part doesn't do anything in those regions and this expression is zero. And, although $\delta(x) = \infty$ at $x = 0$, $x = 0$ at $x = 0$, so the expression is still zero.

So we'll add this on to $\theta(x)$:

$$\theta(x) = \theta(x) + x\,\delta(x) = \theta(x)\frac{dx}{dx} + x\frac{d\theta}{dx} = \frac{d}{dx}\left[x\,\theta(x)\right] \tag{11}$$
where the last step follows from the ``product rule'' for differentiation. At this point, to take the integral of a full differential is trivial, and we get (8).


References
[1] M. Springer. Sunday function [online]. February 2009. Available from: http://scienceblogs.com/builtonfacts/2009/02/sunday_function_22.php [cited 30 June 2009].
[2] E.W. Weisstein. Heaviside step function [online]. Available from: http://mathworld.wolfram.com/HeavisideStepFunction.html [cited 30 June 2009].



1 To be completely precise, it's the (signed) area between the curve and the line .

Thursday, June 4, 2009

The Schrödinger Equation - Corrections

[Click here for a PDF of this post with nicer formatting]
In my last post, I claimed
Additionally, we can extend from here that any quantum operator is written in terms of its classical counterpart by
Peeter Joot correctly pointed out that this result does not follow from the argument involving the Hamiltonian. While it is true that
any arbitrary unitary transformation, , can be written as
where is an Hermitian operator,
the relationship between a classical and its quantum counterpart is not as straightforward as I claimed. In reality, we can only relate the classical Poisson brackets to the quantum mechanical commutators, and we must work from there. Perhaps I will discuss this further in a later post.

In any case, though, the derivation of the Schrödinger equation only makes use of the relationship between the classical and quantum mechanical Hamiltonians, so the remainder of the derivation still holds. I am leaving the original post up as reference, but the corrected, restructured version (with some additional, although slight, notation changes) is below.


A brief walk through classical mechanics
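Say we have a function $f(x)$ of $x$ and we want to translate it in space to a point $x + a$, where $a$ need not be small. To do this, we'll find a ``space translation'' operator $T(a)$ which, when applied to $f(x)$, gives $f(x + a)$. That is,

$$f(x + a) = T(a)\, f(x) \tag{1}$$
We'll expand $f(x + a)$ in a Taylor series:

$$f(x + a) = f(x) + a\frac{df}{dx} + \frac{a^2}{2!}\frac{d^2 f}{dx^2} + \dots = \left[1 + a\frac{d}{dx} + \frac{a^2}{2!}\frac{d^2}{dx^2} + \dots\right] f(x) \tag{2}$$
which can be simplified using the series expansion of the exponential1 to

$$f(x + a) = e^{a\frac{d}{dx}} f(x) \tag{3}$$
from which we can conclude that

$$T(a) = e^{a\frac{d}{dx}} \tag{4}$$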
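The exponential-of-a-derivative statement can be tested on a function whose derivatives are all known. The sketch below uses $f(x) = \sin x$, for which the $n$-th derivative is $\sin(x + n\pi/2)$, and sums the series term by term; the shift $a = 2.5$ is deliberately not small. The function names and the number of terms are illustrative choices.

```python
import math

def nth_derivative_of_sin(n, x):
    """The n-th derivative of sin(x) is sin(x + n*pi/2)."""
    return math.sin(x + n * math.pi / 2)

def translate_sin(x, a, terms=30):
    """Sum the first `terms` terms of exp(a d/dx) acting on sin, evaluated at x."""
    return sum(a**n / math.factorial(n) * nth_derivative_of_sin(n, x)
               for n in range(terms))

x, a = 0.4, 2.5
series_value = translate_sin(x, a)
direct_value = math.sin(x + a)
print("series:", series_value)
print("direct:", direct_value)
```

With 30 terms the two numbers agree to rounding error, even though the shift is several times larger than one.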
If you do a similar thing with rotations around the $z$-axis, you'll find that the rotation operator is

$$R_z(\theta) = e^{\theta L_z} \tag{5}$$
where $L_z$ is the $z$-component of the angular momentum.

Comparing (4) and (5), we see that both have an exponential with a parameter (distance or angle) multiplied by something ($d/dx$ or $L_z$). We'll call the something the ``generator of the transformation.'' So, the generator of space translation is $d/dx$ and the generator of rotation is $L_z$. So, we'll write an arbitrary transformation operator $T$ through a parameter $\lambda$

$$T(\lambda) = e^{\lambda G} \tag{6}$$
where $G$ is the generator of this particular transformation.2 See [1] for an example with Lorentz transformations.


From classical to quantum
Generalizing (6), we'll postulate that any arbitrary quantum mechanical (unitary) transformation operator $\hat{U}(\epsilon)$ through a parameter $\epsilon$ can be written as

$$\hat{U}(\epsilon) = e^{\epsilon\hat{G}} \tag{7}$$
where $\hat{G}$ is the quantum mechanical version of the classical operator $G$. We'll call this the ``quantum mechanical generator of the transformation.'' If we have a way of relating a classical generator to a quantum mechanical one, then we have a way of finding a quantum mechanical transformation operator.

For example, in classical dynamics, the time derivative of a quantity $A$ is given by the Poisson bracket:

$$\frac{dA}{dt} = \{A, H\} \tag{8}$$
where $H$ is the classical Hamiltonian of the system and $\{A, H\}$ is shorthand for a messy equation.[2] In quantum mechanics this equation is replaced with

$$\frac{d\hat{A}}{dt} = \frac{1}{i\hbar}[\hat{A}, \hat{H}] \tag{9}$$
where the square brackets signify a commutation relation and $\hat{H}$ is the quantum mechanical Hamiltonian.[3] This holds true for any quantity $A$, and $i\hbar$ is a number which commutes with everything, so we can argue that the quantum mechanical Hamiltonian operator is related to the classical Hamiltonian by

$$H \;\longrightarrow\; \frac{\hat{H}}{i\hbar} \tag{10}$$
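To make the correspondence between (8) and (9) concrete, here is a short worked example for a single particle in one dimension. The bracket sign convention, the Hamiltonian $H = p^2/2m + V(x)$, and the canonical commutator $[\hat{x}, \hat{p}] = i\hbar$ are assumptions brought in for illustration. Some texts define the Poisson bracket with the opposite sign.

```latex
% Classical side: Poisson bracket of x with H = p^2/2m + V(x)
\{A, H\} = \frac{\partial A}{\partial x}\frac{\partial H}{\partial p}
         - \frac{\partial A}{\partial p}\frac{\partial H}{\partial x}
\quad\Longrightarrow\quad
\frac{dx}{dt} = \{x, H\} = \frac{\partial H}{\partial p} = \frac{p}{m}

% Quantum side: V(\hat{x}) commutes with \hat{x}, and [\hat{x}, \hat{p}^2] = 2 i\hbar\,\hat{p}
\frac{d\hat{x}}{dt} = \frac{1}{i\hbar}[\hat{x}, \hat{H}]
                    = \frac{1}{i\hbar}\,\frac{1}{2m}\,[\hat{x}, \hat{p}^2]
                    = \frac{1}{i\hbar}\,\frac{2 i\hbar\,\hat{p}}{2m}
                    = \frac{\hat{p}}{m}
```

Both sides give velocity equal to momentum over mass, which is the sense in which the bracket with $H$ and the commutator with $\hat{H}/(i\hbar)$ play the same role.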


Time translation of a quantum state
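Consider a quantum state at time $t$ described by the wavefunction $\psi(x,t)$. To see how the state changes with time, we want to find a ``time-translation'' operator $\hat{U}(\Delta t)$ which, when applied to the state $\psi(x,t)$, will give $\psi(x,t+\Delta t)$. That is,

$$\hat{U}(\Delta t)\,\psi(x,t) = \psi(x,t+\Delta t) \tag{11}$$
From our previous discussion we know that if we know the classical generator of time translation we can write $\hat{U}(\Delta t)$ using (7). Classically, the generator of time translations is the Hamiltonian![4] So we can write

$$\hat{U}(\Delta t) = e^{\Delta t\,\hat{H}/(i\hbar)} = e^{-\frac{i}{\hbar}\hat{H}\Delta t} \tag{12}$$
where we've made the substitution from (10). Then (11) becomes

$$e^{-\frac{i}{\hbar}\hat{H}\Delta t}\,\psi(x,t) = \psi(x,t+\Delta t) \tag{13}$$

This holds true for any time translation, so we'll consider a small time translation $\Delta t$ and expand (13) using a Taylor expansion,³ dropping all quadratic and higher terms:

$$\left(1 - \frac{i}{\hbar}\hat{H}\Delta t\right)\psi(x,t) = \psi(x,t+\Delta t) \tag{14}$$
Moving things around gives

$$\hat{H}\,\psi(x,t) = i\hbar\,\frac{\psi(x,t+\Delta t) - \psi(x,t)}{\Delta t} \tag{15}$$
In the limit $\Delta t \to 0$ the right-hand side becomes a partial derivative, giving the Schrödinger equation

$$\hat{H}\,\psi(x,t) = i\hbar\,\frac{\partial}{\partial t}\psi(x,t) \tag{16}$$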
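The step from the exponential to its first-order expansion can be checked numerically. The sketch below assumes an energy eigenstate, so the Hamiltonian acts as multiplication by a number `E`, and it works in units where `HBAR = 1`. The names `first_order_evolution`, `E`, and `T_TOTAL` are illustrative choices, not part of the derivation. Applying the linearized step many times over a fixed total time should approach the exact phase factor $e^{-iEt/\hbar}$ as the step shrinks.

```python
import cmath

HBAR = 1.0      # natural units (assumption)
E = 2.0         # energy of the eigenstate (illustrative value)
T_TOTAL = 1.5   # total evolution time (illustrative value)

def first_order_evolution(n_steps):
    """Evolve psi(0) = 1 by repeatedly applying (1 - i E dt / hbar)."""
    dt = T_TOTAL / n_steps
    psi = 1.0 + 0.0j
    for _ in range(n_steps):
        psi = (1 - 1j * E * dt / HBAR) * psi
    return psi

# Exact result for an energy eigenstate: a pure phase rotation.
exact = cmath.exp(-1j * E * T_TOTAL / HBAR)

step_counts = (10, 100, 1000, 10000)
errors = [abs(first_order_evolution(n) - exact) for n in step_counts]
for n, err in zip(step_counts, errors):
    print(n, err)
```

The error shrinks roughly in proportion to the step size, which is what dropping the quadratic and higher terms would lead one to expect.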

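For a system with conserved total energy, the classical Hamiltonian is the total energy

$$H = \frac{p^2}{2m} + V(x) \tag{17}$$
which, making the substitution for quantum mechanical momentum $p \to \hat{p} = -i\hbar\frac{\partial}{\partial x}$ and substituting into (16), gives the familiar differential equation form of the Schrödinger equation

$$-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\psi(x,t) + V(x)\,\psi(x,t) = i\hbar\,\frac{\partial}{\partial t}\psi(x,t) \tag{18}$$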
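As a rough check of the differential form, the sketch below tests a free-particle plane wave against the equation using finite differences. It assumes $V = 0$, units with `HBAR = M = 1`, and the dispersion relation $\hbar\omega = \hbar^2 k^2 / 2m$. The names `psi`, `residual`, `K`, and `OMEGA` are illustrative.

```python
import cmath

HBAR = 1.0                       # natural units (assumption)
M = 1.0                          # particle mass (assumption)
K = 1.3                          # wavenumber (illustrative value)
OMEGA = HBAR * K**2 / (2 * M)    # free-particle dispersion relation

def psi(x, t):
    """Plane wave exp(i (k x - omega t))."""
    return cmath.exp(1j * (K * x - OMEGA * t))

def residual(x, t, h=1e-4):
    """|LHS - RHS| of the free Schrodinger equation, by central differences."""
    d2psi_dx2 = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h**2
    dpsi_dt = (psi(x, t + h) - psi(x, t - h)) / (2 * h)
    lhs = -HBAR**2 / (2 * M) * d2psi_dx2   # kinetic term with V = 0
    rhs = 1j * HBAR * dpsi_dt
    return abs(lhs - rhs)

res = residual(0.7, 0.4)
print(res)
```

The residual is not exactly zero because of the finite-difference approximation, but it should be small compared with the size of either side, which is of order $k^2/2$ here.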


References
[1] J.D. Jackson. Classical Electrodynamics. John Wiley & Sons, Inc., 3rd edition, 1998.
[2] L.D. Landau and E.M. Lifshitz. Mechanics. Pergamon Press, Oxford, UK.
[3] L.D. Landau and E.M. Lifshitz. Quantum Mechanics. Butterworth-Heinemann, Oxford, UK.
[4] H. Goldstein, C. Poole, and J. Safko. Classical Mechanics. Addison-Wesley, San Francisco, CA, 3rd edition, 2002.



1 $e^{x} = \sum_{n=0}^{\infty}\frac{x^n}{n!}$
2 There are other ways to do this, differing by factors of $i$ in the definition of the generators and in the construction of the exponential, but I'm sticking with this one for now.
3 Kind of the reverse of how we got to this whole exponential notation in the first place...

Tuesday, May 26, 2009

The Schrödinger Equation

Update: A corrected and improved version of this post is now up: http://behindtheguesses.blogspot.com/2009/06/schrodinger-equation-corrections.html

[Click here for a PDF of this post with nicer formatting]
notElon asked me to discuss and try to derive the Schrödinger equation, so I'll give it a shot. This derivation is partially based on Sakurai,[1] with some differences.

A brief walk through classical mechanics
Say we have a function $f(x)$ of $x$ and we want to translate it in space to a point $x + a$. To do this, we'll find a ``space translation'' operator $T(a)$ which, when applied to $f(x)$, gives $f(x+a)$. That is,

$$T(a)\,f(x) = f(x+a) \tag{1}$$
We'll expand $f(x+a)$ in a Taylor series:

$$f(x+a) = f(x) + a\frac{df}{dx} + \frac{a^2}{2!}\frac{d^2 f}{dx^2} + \cdots = \sum_{n=0}^{\infty}\frac{a^n}{n!}\frac{d^n}{dx^n}f(x) \tag{2}$$
which can be simplified using the series expansion of the exponential¹ to

$$f(x+a) = e^{a\frac{d}{dx}}f(x) \tag{3}$$
from which we can conclude that

$$T(a) = e^{a\frac{d}{dx}} \tag{4}$$
If you do a similar thing with rotations around the $z$-axis, you'll find that the rotation operator is

$$R(\theta) = e^{\theta L_z} \tag{5}$$
where $L_z$ is the $z$-component of the angular momentum.

Comparing (4) and (5), we see that both have an exponential with a parameter (distance or angle) multiplied by something ($\frac{d}{dx}$ or $L_z$). We'll call the something the ``generator of the transformation.'' So, the generator of space translation is $\frac{d}{dx}$ and the generator of rotation is $L_z$. We'll write an arbitrary transformation operator through a parameter $\epsilon$ as

$$U(\epsilon) = e^{\epsilon G} \tag{6}$$
where $G$ is the generator of this particular transformation.² See [2] for an example with Lorentz transformations.


From classical to quantum
In classical dynamics, the time derivative of a quantity $A$ is given by the Poisson bracket:

$$\frac{dA}{dt} = \{A, H\} \tag{7}$$
where $H$ is the classical Hamiltonian of the system and $\{A, H\}$ is shorthand for a messy equation.[3] In quantum mechanics this equation is replaced with

$$\frac{d\hat{A}}{dt} = \frac{1}{i\hbar}[\hat{A}, \hat{H}] \tag{8}$$
where the square brackets signify a commutation relation and $\hat{H}$ is the quantum mechanical Hamiltonian.[4] This holds true for any quantity $A$, and $i\hbar$ is a number which commutes with everything, so we can argue that the quantum mechanical Hamiltonian operator is related to the classical Hamiltonian by

$$\hat{H} = i\hbar H \tag{9}$$
specifically.

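Additionally, we can extend from here that any quantum operator $\hat{A}$ is written in terms of its classical counterpart $A$ by

$$\hat{A} = i\hbar A \tag{10}$$

So, using (4) the quantum mechanical space translation operator is given by

$$\hat{T}(a) = e^{-\frac{i}{\hbar}a\hat{p}} \tag{11}$$
and, using (5), the rotation operator by

$$\hat{R}(\theta) = e^{-\frac{i}{\hbar}\theta\hat{L}_z} \tag{12}$$
or, from (6) any arbitrary (unitary) transformation, $\hat{U}(\epsilon)$, can be written as

$$\hat{U}(\epsilon) = e^{-\frac{i}{\hbar}\epsilon\hat{G}} \tag{13}$$
where $\hat{G}$ is (a Hermitian operator and is) the quantum counterpart, through (10), of the classical generator of the transformation.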


Time translation of a quantum state
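Consider a quantum state at time $t$ described by the wavefunction $\psi(x,t)$. To see how the state changes with time, we want to find a ``time-translation'' operator $\hat{U}(\Delta t)$ which, when applied to the state $\psi(x,t)$, will give $\psi(x,t+\Delta t)$. That is,

$$\hat{U}(\Delta t)\,\psi(x,t) = \psi(x,t+\Delta t) \tag{14}$$
From our previous discussion we know that if we know the classical generator of time translation we can write $\hat{U}(\Delta t)$ using (13). Well, classically, the generator of time translations is the Hamiltonian![5] So we can write

$$\hat{U}(\Delta t) = e^{-\frac{i}{\hbar}\hat{H}\Delta t} \tag{15}$$
and (14) becomes

$$e^{-\frac{i}{\hbar}\hat{H}\Delta t}\,\psi(x,t) = \psi(x,t+\Delta t) \tag{16}$$

This holds true for any time translation, so we'll consider a small time translation $\Delta t$ and expand (16) using a Taylor expansion,³ dropping all quadratic and higher terms:

$$\left(1 - \frac{i}{\hbar}\hat{H}\Delta t\right)\psi(x,t) = \psi(x,t+\Delta t) \tag{17}$$
Moving things around gives

$$\hat{H}\,\psi(x,t) = i\hbar\,\frac{\psi(x,t+\Delta t) - \psi(x,t)}{\Delta t} \tag{18}$$
In the limit $\Delta t \to 0$ the right-hand side becomes a partial derivative, giving the Schrödinger equation

$$\hat{H}\,\psi(x,t) = i\hbar\,\frac{\partial}{\partial t}\psi(x,t) \tag{19}$$

For a system with conserved total energy, the classical Hamiltonian is the total energy

$$H = \frac{p^2}{2m} + V(x) \tag{20}$$
which, making the substitution for quantum mechanical momentum $p \to \hat{p} = -i\hbar\frac{\partial}{\partial x}$ and substituting into (19), gives the familiar differential equation form of the Schrödinger equation

$$-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\psi(x,t) + V(x)\,\psi(x,t) = i\hbar\,\frac{\partial}{\partial t}\psi(x,t) \tag{21}$$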


References
[1] J.J. Sakurai. Modern Quantum Mechanics. Addison-Wesley, San Francisco, CA, revised edition, 1993.
[2] J.D. Jackson. Classical Electrodynamics. John Wiley & Sons, Inc., 3rd edition, 1998.
[3] L.D. Landau and E.M. Lifshitz. Mechanics. Pergamon Press, Oxford, UK.
[4] L.D. Landau and E.M. Lifshitz. Quantum Mechanics. Butterworth-Heinemann, Oxford, UK.
[5] H. Goldstein, C. Poole, and J. Safko. Classical Mechanics. Addison-Wesley, San Francisco, CA, 3rd edition, 2002.



1 $e^{x} = \sum_{n=0}^{\infty}\frac{x^n}{n!}$
2 There are other ways to do this, differing by factors of $i$ in the definition of the generators and in the construction of the exponential, but I'm sticking with this one for now.
3 Kind of the reverse of how we got to this whole exponential notation in the first place...