Small Perturbation - fun calculations
https://smallperturbation.com/taxonomy/term/15
Steamed Hams But The Episodes Didn't Stay Good
https://smallperturbation.com/simpsons-decline
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>I'm not sure what I did on weekdays at 9pm during my earliest years but after 2000 the answer was simple. I was watching <i>The Simpsons</i> on <a href="https://en.wikipedia.org/wiki/The_Comedy_Network">The Comedy Network</a>. Most of my middle school friends were doing the same and this put us in a position to witness the end of an era. Reruns that aired every day seemed strikingly different from the new episodes that aired on Sunday and it did not take us long to figure out that the latter were steadily getting worse. The decline of <i>The Simpsons</i> is common knowledge but I recently <a href="http://digg.com/2017/charting-the-simpsons-decline">found a visualization</a> which puts it in a rather harsh perspective. This chart, by filmmaker <a href="https://en.wikipedia.org/wiki/Sol_Harris">Sol Harris</a>, points out among other things that <i>The Simpsons</i> has more than twice as many episodes as the next longest running cartoon. The writers would be able to take 14 years off to come up with good ideas again and still be able to call themselves the record holders. I would say the tail is so long that some viewers, who have already been watching for a number of years, were born after the last good episode.</p>
<p></p><center><br />
<a href="/sites/default/files/post_images/2019-04-11_simpsons_ratings.png"><img src="/sites/default/files/post_images/2019-04-11_simpsons_small.png" width="900" alt="A plot showing a rating out of 10 issued by Sol Harris for each 1989-2017 Simpsons episode." /></a><br />
</center>
<p>But what is the last good episode? Modulo some broken clocks that are right twice a day, Harris tells us that it is <a href="https://en.wikipedia.org/wiki/Treehouse_of_Horror_XII">the 12th Halloween special</a> which aired at the beginning of season 13. The episode afterwards called <a href="https://en.wikipedia.org/wiki/The_Parent_Rap">"The Parent Rap"</a> is scorned as "the definitive moment when the show went from 'Bad Simpsons' to 'Bad Television'". In the Harris ratings, which I have re-plotted above, the beginning of season 13 indeed looks like a turning point. A more satisfying exercise is to justify this with maximum likelihood estimation.</p>
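<p>To make "maximum likelihood estimation" concrete, here is a minimal sketch of one way such a split could be found. It assumes a single change point, Gaussian noise with a shared variance and a separate mean on each side; that model, the function name <code>best_split</code> and the toy ratings are illustrative assumptions, not necessarily what the analysis discussed in this post uses.</p>

```python
import math

def best_split(ratings):
    """Return (k, log_likelihood): the first k episodes form the 'good' block.

    Assumed model: each block has its own mean, both share one variance.
    The variance is replaced by its maximum likelihood estimate for each k.
    """
    n = len(ratings)
    best_k, best_ll = None, -math.inf
    for k in range(1, n):
        left, right = ratings[:k], ratings[k:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        sse = (sum((x - mean_left) ** 2 for x in left)
               + sum((x - mean_right) ** 2 for x in right))
        sigma2 = max(sse / n, 1e-12)  # guard against log(0) on perfect fits
        log_lik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
        if log_lik > best_ll:
            best_k, best_ll = k, log_lik
    return best_k, best_ll

# Toy ratings (made up for illustration): six good episodes, then six bad ones.
toy = [8, 8, 7, 8, 9, 8, 3, 4, 3, 2, 3, 4]
k, _ = best_split(toy)
print("last good episode in the toy series:", k)
```

<p>Under this assumed model, maximizing the likelihood is the same as minimizing the total squared deviation from the two block means, so the split lands where the ratings drop.</p>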
<!--break--><p>This idea comes from a guy named Nathan Cunningham whose blog has some nice statistical gems. In <a href="http://www.nathancunn.com/2017-10-26-simpsons-decline/">Cunningham's analysis</a>, he uses ratings from IMDB to find the optimal partitioning into good and bad episodes. I think there are good reasons for repeating this with Harris' data set. First, changes to IMDB have broken the <a href="https://en.wikipedia.org/wiki/R_%28programming_language%29">R code</a> for scraping the site. Second, the detailed chart that Harris <a href="https://twitter.com/solmaquina/status/887635981972770816">posted on Twitter</a> was criticized for its low resolution. Transcribing it into a machine readable format in order to help those users is something I wanted to do anyway. And third, Sol Harris has expertise in this area. Presenting his opinions as the gospel shows a better understanding of humour than any humble warning about their subjectivity ever could. And so I present, <a href="/sites/default/files/post_images/2019-04-11_simpsons_data.csv">the spreadsheet</a> of the ratings. The first row should explain the format.</p>
<p></p><center>
<table>
<tr>
<td><b>Season</b></td>
<td><b>Number in Season</b></td>
<td><b>Number in Show</b></td>
<td><b>Title</b></td>
<td><b>Date</b></td>
<td><b>Rating</b></td>
<td><b>Superlatives</b></td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>1</td>
<td>The Simpsons Roasting on an Open Fire</td>
<td>1989-12-17</td>
<td>7</td>
<td>0</td>
</tr>
</table>
<p></p></center>
<p>Evidently, Sol Harris gave the first episode a rating of 7/10. A 0 in the superlatives column means that this was neither the best nor the worst episode of the season. These episodes are marked with 1 and -1 respectively. To get this data, I looked at the 2048x280 version of the chart and manually recorded each position hit by the line. Streaks with many consecutive episodes of the same rating were a little hard to discern. Nevertheless, one can use the fact that each episode takes up about 3 pixels of horizontal space. Coupled with the occasional checkpoints, marking the best / worst episodes and season boundaries, the ratings can be written down in all rows from 1 to 618. Doing this took some time but nowhere near the amount of time Sol Harris must have spent suffering through hundreds of barely watchable episodes. It is now time to reveal the results from Cunningham's code.</p>
<p></p><center>
<table>
<tr>
<td></td>
<td><b>Last Good Episode</b></td>
<td><b>Season</b></td>
<td><b>Title</b></td>
</tr>
<tr>
<td><b>IMDB ratings</b></td>
<td>214</td>
<td>10</td>
<td>Wild Barts Can't Be Broken</td>
</tr>
<tr>
<td><b>Harris ratings</b></td>
<td>270</td>
<td>13</td>
<td>Treehouse of Horror XII</td>
</tr>
</table>
<p></p></center>
<p>I would say this is very much in line with my opinion — if I were to do a Simpsons binge watch, it would end somewhere in season 13. In discussions about where to draw the line, there are a few schools of thought. This is partly due to personal preference and partly because of the ambiguity in what exactly is being asked. Having the decline <a href="https://en.wikipedia.org/wiki/Jumping_the_shark">first become noticeable</a> is different from the final nail in the coffin. Season 9's <a href="https://en.wikipedia.org/wiki/The_Principal_and_the_Pauper">"The Principal and the Pauper"</a> is one of the most frequently criticized episodes, and for good reason. It tries to sell audiences on the idea that Seymour Skinner, one of the main characters outside the titular family, has been secretly committing identity theft for the last nine years. Cast member Harry Shearer, who voices Skinner, <a href="https://www.youtube.com/watch?v=KqFNbCcyFkk">pointed out</a> that this was a lazy story that would do nothing but alienate the audience.</p>
<p>Despite the obvious mistake that "The Principal and the Pauper" was, I think it's worth holding out until the brilliant <a href="https://en.wikipedia.org/wiki/Trilogy_of_Error">"Trilogy of Error"</a> in season 12. The vast <a href="https://deadhomersociety.com/manifesto/">Dead Homer Society</a> website admits that this episode is an exception even when it condemns the <a href="https://deadhomersociety.com/zombiesimpsons/zs1/">double-digit seasons</a> in general. Something else that I found within its miniature book is a plot of how often Homer takes on <a href="https://deadhomersociety.com/zombiesimpsons/zs10/">a throwaway new job</a>. There is indeed a sharp peak in season 10. I found a similar peak when counting the number of guest stars playing themselves. The rules for this are to combine members of performing groups and to specify that someone who appears two or more times in the same season is not counted again until the next season.</p>
<p></p><center><br />
<img src="/sites/default/files/post_images/2019-04-11_simpsons_actors2.png" width="600" alt="A plot showing that seasons 10 and 11 of The Simpsons have more celebrity cameos than previous seasons by a large margin." /><br />
</center>
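<p>The counting rules for the plot above can be sketched as follows. The input layout (season, performer, optional group) and the placeholder names are hypothetical; only the two rules, merging group members and counting each guest once per season, come from the text.</p>

```python
def cameos_per_season(appearances):
    """Count self-playing guests per season.

    appearances: iterable of (season, performer, group_or_None).
    Members of the same group collapse into one entry, and repeat
    appearances within a season are counted only once.
    """
    seen = set()
    for season, performer, group in appearances:
        seen.add((season, group or performer))
    counts = {}
    for season, _ in seen:
        counts[season] = counts.get(season, 0) + 1
    return counts

# Hypothetical records, not taken from the real data set.
sample = [
    (10, "Singer A", "Band X"),
    (10, "Singer B", "Band X"),   # same band, same season: one entry
    (10, "Actor C", None),
    (10, "Actor C", None),        # second appearance in season 10: ignored
    (11, "Actor C", None),        # new season: counted again
]
print(cameos_per_season(sample))
```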
<p>While doing background reading for this post, I found out that Julian Assange is actually one of these people who played himself <a href="https://en.wikipedia.org/wiki/At_Long_Last_Leave">in 2012</a>. And while writing this post, I found out that he was just denied asylum at the behest of Donald Trump and other enemies of a free press. I guess this answers the question of which will end first: <i>The Simpsons</i> or the stalemate surrounding the only political prisoner in Europe. These times call for reason, compassion and satire — the most effective medium for challenging authority. The necessary inspiration can come from all sorts of places and <i>The Simpsons</i> is high on my list... just make damn sure you stop watching after The Golden Years!</p>
</div></div></div><div class="field field-name-taxonomy-vocabulary-2 field-type-taxonomy-term-reference field-label-above"><div class="field-label">Tags: </div><div class="field-items"><div class="field-item even"><a href="/taxonomy/term/13">protest</a></div><div class="field-item odd"><a href="/taxonomy/term/15">fun calculations</a></div><div class="field-item even"><a href="/taxonomy/term/27">must see</a></div></div></div>
Thu, 11 Apr 2019 07:10:35 +0000
root
So I Felt Like Solving Zombie Dice
https://smallperturbation.com/zombie-dice-strategy
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><table>
<tr>
<td>
There is a party game called <a href="http://en.wikipedia.org/wiki/Zombie_Dice">Zombie Dice</a> where you roll dice and try to get as many points as you can without losing all of your health and dying. Green dice make points more likely than death, red dice make death more likely than points and yellow dice are neutral. I played it with some friends a while ago and realized that it was probably simple enough for me to come up with the optimal strategy.
</td>
<td>
<img src="/sites/default/files/post_images/2013-06-08_zombie_dice.jpg" alt="Three of the dice from the game" />
</td>
</tr>
</table>
<p>Just about every game is solvable in principle, but it is very easy to make a game that would take longer than the age of the universe to solve. The prototypical solved game is <a href="http://xkcd.com/832/">tic-tac-toe</a>. The prototypical game that is still a long way from being solved is <a href="http://en.wikipedia.org/wiki/Endgame_tablebase">chess</a>. An <a href="http://xkcd.com/1002/">xkcd comic</a> lists a bunch of solved games so let's see how we can add Zombie Dice to this list.</p>
<!--break--><p>Now the rules of Zombie Dice are that you have a cup containing 13 dice and on your turn, you randomly choose dice to roll. You always roll three at a time and you can keep doing this until you die, run out of dice or choose to pass the cup on to the next person. There are three icons that can be rolled:</p>
<ul>
<li>
<b>Brains:</b> Each brain is a point. The possibility of more brains would encourage a person to keep rolling. Green dice have three, yellow have two and red have one.
</li>
<li>
<b>Shotguns:</b> Shotgun blasts lower your health by one, and if three of them are encountered in a turn, the turn ends with 0 points. The possibility of more shotguns would discourage one from continuing to roll. Green dice have one, yellow have two and red have three.
</li>
<li>
<b>Walkers:</b> Denoted by footprints, this result means that the die will be included in the next roll if you choose to roll again. Instead of drawing three new dice, you draw three minus the number of walkers from the last roll. All dice have two of these.
</li>
</ul>
<p>The cup begins with 6 green dice, 4 yellow dice and 3 red dice. The first player to get 13 points wins. This can be done in one turn but the probability is about one in a million. The need for some basic strategy is clear. You could end up with three green dice for your first roll and then roll them and get a lot of brains, which is good. But this makes the second roll more risky. The fact that three green dice have been used up means that the next dice are more likely to be yellow or red, which increases your chances of being shot. This uncertainty about whether to roll is part of the fun of the game. It can be surmounted by using a formula for the expected number of points after one more roll, <img class="teximage" src="/sites/default/files/tex/e11216427be697feafab0bbd229289f1dd94ec60.png" alt="$ E $" />. When deciding whether to continue, a player should roll if <img class="teximage" src="/sites/default/files/tex/f68f64f77e15aa05432ba1d283fb2341e5ac47b0.png" alt="$ E > n $" /> and cash in if <img class="teximage" src="/sites/default/files/tex/193d2731fb1e2ad1aab151f05fe9ccec74d9f707.png" alt="$ E < n $" />.</p>
<p>The game state is a function of a few different variables. First, you need to know how many green dice are left <img class="teximage" src="/sites/default/files/tex/ef4462522b00ce703a7db401a426b3f3815b6c80.png" alt="$ N_g $" />, how many yellow dice are left <img class="teximage" src="/sites/default/files/tex/445eed3c3fcf6486e05f9320b93c683c15622afa.png" alt="$ N_y $" /> and how many red dice are left <img class="teximage" src="/sites/default/files/tex/1edb2a52045254c0b3797cdf289d3d1d991c5bc0.png" alt="$ N_r $" />. You also need to know your health <img class="teximage" src="/sites/default/files/tex/92bbc856f0d886eb1d6691336a4cffa24996aec6.png" alt="$ h $" /> and the number of points you already have <img class="teximage" src="/sites/default/files/tex/fe695a9389c9b0c07751fac97facfc61a7830b93.png" alt="$ n $" />. Also dictating the state of the game are the number of walkers you have and the colours of the dice on which they appear. The initial condition is <img class="teximage" src="/sites/default/files/tex/512bebf2b369b9a0610322fdff9ace762a3a0e83.png" alt="$ N_g = 6 $" />, <img class="teximage" src="/sites/default/files/tex/38b03c704370cd8384d5cf651bc21a443d12187e.png" alt="$ N_y = 4 $" />, <img class="teximage" src="/sites/default/files/tex/8c441ca6015d412794283eee5f884438adf98b33.png" alt="$ N_r = 3 $" />, <img class="teximage" src="/sites/default/files/tex/380d85f1826e29e7fe1d993c07c1a93f8e5ef9ea.png" alt="$ h = 3 $" /> and <img class="teximage" src="/sites/default/files/tex/0f97d39346f2d89d4b9e1892496098e4fc2e6dc6.png" alt="$ n = 0 $" /> with no walkers.</p>
<p>If a player rolls three walkers, the following roll would just be a re-roll so in this case</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/46ec5a2c84398ec32020b76ba6769de04bc23e5a.png" alt="\[<br /> E = E_{w_1, w_2, w_3} (n, h)<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Rolling two walkers, on the other hand, requires the player to draw one die. This can be a green, a yellow or a red die, so there are three expectation values to consider. Moreover, they must be weighted by the probability that the given die will be drawn, so the overall expected number of points is</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/edba9f5778c3a381b5d438be3995b526963fd5e1.png" alt="\begin{align*}<br /> E &= P(g|N_g, N_y, N_r) E_{w_1, w_2, g} (n, h) + P(y|N_g, N_y, N_r) E_{w_1, w_2, y} (n, h) + P(r|N_g, N_y, N_r) E_{w_1, w_2, r} (n, h) \\<br /> &= \frac{1}{N_g + N_y + N_r} \,\, \left [ N_g E_{w_1, w_2, g} (n, h) + N_y E_{w_1, w_2, y} (n, h) + N_r E_{w_1, w_2, r} (n, h) \right ]<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>where in the last step I have written in the expressions for how likely it is that a given die will be drawn. With one walker, there are two dice to be drawn leading to a weighted sum of six conditional expectations:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/3527f03b93463e79d8c157cdafddc36e3d97fc2f.png" alt="\begin{align*}<br /> E &= P(g, g|N_g, N_y, N_r) E_{w, g, g} (n, h) + P(y, y|N_g, N_y, N_r) E_{w, y, y} (n, h) + P(r, r|N_g, N_y, N_r) E_{w, r, r} (n, h) \\<br /> &+ P(g, y|N_g, N_y, N_r) E_{w, g, y} (n, h) + P(g, r|N_g, N_y, N_r) E_{w, g, r} (n, h) + P(y, r|N_g, N_y, N_r) E_{w, y, r} (n, h) \\<br /> &= \frac{1}{\binom{N_g + N_y + N_r}{2}} \, \left [ \binom{N_g}{2} E_{w, g, g} (n, h) + \binom{N_y}{2} E_{w, y, y} (n, h) + \binom{N_r}{2} E_{w, r, r} (n, h) \right \none \\<br /> &\left \none + N_g N_y E_{w, g, y} (n, h) + N_g N_r E_{w, g, r} (n, h) + N_y N_r E_{w, y, r} (n, h) \right ]<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>With no walkers and no dice pre-determined, we get ten terms in the sum:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/45f2a52bb9376bcc8834184b9f822c7520dec024.png" alt="\begin{align*}<br /> E &= P(g, g, g|N_g, N_y, N_r) E_{g, g, g} (n, h) + P(g, g, y|N_g, N_y, N_r) E_{g, g, y} (n, h) + P(g, g, r|N_g, N_y, N_r) E_{g, g, r} (n, h) \\<br /> &+ P(y, y, y|N_g, N_y, N_r) E_{y, y, y} (n, h) + P(y, y, g|N_g, N_y, N_r) E_{y, y, g} (n, h) + P(y, y, r|N_g, N_y, N_r) E_{y, y, r} (n, h) \\<br /> &+ P(r, r, r|N_g, N_y, N_r) E_{r, r, r} (n, h) + P(r, r, g|N_g, N_y, N_r) E_{r, r, g} (n, h) + P(r, r, y|N_g, N_y, N_r) E_{r, r, y} (n, h) \\<br /> &+ P(g, y, r|N_g, N_y, N_r) E_{g, y, r} (n, h) \\<br /> &= \frac{1}{\binom{N_g + N_y + N_r}{3}} \left [ \binom{N_g}{3} E_{g, g, g} (n, h) + N_y \binom{N_g}{2} E_{g, g, y} (n, h) + N_r \binom{N_g}{2} E_{g, g, r} (n, h) \right \none \\<br /> &\left \none + \binom{N_y}{3} E_{y, y, y} (n, h) + N_g \binom{N_y}{2} E_{y, y, g} (n, h) + N_r \binom{N_y}{2} E_{y, y, r} (n, h) \right \none \\<br /> &\left \none + \binom{N_r}{3} E_{r, r, r} (n, h) + N_g \binom{N_r}{2} E_{r, r, g} (n, h) + N_y \binom{N_r}{2} E_{r, r, y} (n, h) \right \none \\<br /> &\left \none + N_g N_y N_r E_{g, y, r} (n, h) \right ]<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
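<p>The weights in these sums are just counts of ways to pick dice of each colour from the cup. A small sketch, with a hypothetical helper <code>draw_weights</code> that is not part of the original post, reproduces them for one, two or three drawn dice and confirms that they add up to one.</p>

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import comb

def draw_weights(cup, k):
    """Probability of each colour combination when k dice are drawn.

    cup: dict mapping colour to the number of dice of that colour left.
    Returns a dict mapping a sorted colour tuple to its probability.
    """
    total = comb(sum(cup.values()), k)
    weights = {}
    for combo in combinations_with_replacement(sorted(cup), k):
        ways = 1
        for colour, count in cup.items():
            ways *= comb(count, combo.count(colour))
        if ways:
            weights[combo] = Fraction(ways, total)
    return weights

start = {"g": 6, "y": 4, "r": 3}
for k in (1, 2, 3):
    w = draw_weights(start, k)
    print(k, "dice drawn:", len(w), "terms, total probability", sum(w.values()))
```

<p>With the full cup this gives three, six and ten terms respectively, matching the sums above.</p>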
<p>Once we know the ten expected values, these four expressions allow us to plug in the game state and always know whether it makes sense to keep betting or fold. The expectations are tedious to calculate but I didn't come this far to back down now! Here is a list of them:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/1b39ea0ce63ac692b3855adec8496eb0e642fb0a.png" alt="\begin{align*}<br /> E_{g, g, g} (n, h) &= \left ( -\frac{5}{36} h^2 + \frac{55}{72} h - \frac{5}{108} \right ) n + \left ( -\frac{3}{16} h^2 + \frac{47}{48} h + \frac{1}{4} \right ) \\<br /> E_{g, g, y} (n, h) &= \left ( -\frac{11}{72} h^2 + \frac{7}{8} h - \frac{7}{27} \right ) n + \left ( -\frac{5}{24} h^2 + \frac{239}{216} h - \frac{1}{9} \right ) \\<br /> E_{g, g, r} (n, h) &= \left ( -\frac{1}{6} h^2 + \frac{71}{72} h - \frac{17}{36} \right ) n + \left ( -\frac{11}{48} h^2 + \frac{533}{432} h - \frac{17}{36} \right ) \\<br /> E_{y, y, y} (n, h) &= \left ( -\frac{1}{9} h^2 + \frac{7}{9} h - \frac{10}{27} \right ) n + \left ( -\frac{1}{6} h^2 + \frac{17}{18} h - \frac{1}{3} \right ) \\<br /> E_{y, y, g} (n, h) &= \left ( -\frac{5}{36} h^2 + \frac{31}{36} h - \frac{19}{54} \right ) n + \left ( -\frac{7}{36} h^2 + \frac{115}{108} h - \frac{5}{18} \right ) \\<br /> E_{y, y, r} (n, h) &= \left ( -\frac{1}{12} h^2 + \frac{25}{36} h - \frac{7}{18} \right ) n + \left ( -\frac{5}{36} h^2 + \frac{89}{108} h - \frac{7}{18} \right ) \\<br /> E_{r, r, r} (n, h) &= \left ( \frac{3}{8} h - \frac{1}{4} \right ) n + \left ( -\frac{1}{16} h^2 + \frac{7}{16} h - \frac{1}{4} \right ) \\<br /> E_{r, r, g} (n, h) &= \left ( -\frac{1}{12} h^2 + \frac{17}{24} h - \frac{5}{12} \right ) n + \left ( -\frac{19}{144} h^2 + \frac{13}{16} h - \frac{5}{12} \right ) \\<br /> E_{r, r, y} (n, h) &= \left ( -\frac{1}{24} h^2 + \frac{13}{24} h - \frac{1}{3} \right ) n + \left ( -\frac{7}{72} h^2 + \frac{5}{8} h - \frac{1}{3} \right ) \\<br /> E_{g, y, r} (n, h) &= \left ( -\frac{1}{8} h^2 + \frac{61}{72} h - \frac{4}{9} \right ) n + \left ( -\frac{13}{72} h^2 + \frac{221}{216} h - \frac{4}{9} \right )<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
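<p>Closed forms like these are easy to get wrong by hand, so a brute-force cross-check is useful. The sketch below enumerates all 216 face combinations for three dice, using the face counts from the rules above; the names <code>FACES</code> and <code>expected_points</code> are mine. It computes the expected number of points after one roll, assuming the player stops afterwards.</p>

```python
from fractions import Fraction
from itertools import product

# Six faces per die: B = brain, W = walker, S = shotgun.
FACES = {"g": "BBBWWS", "y": "BBWWSS", "r": "BWWSSS"}

def expected_points(colours, n, h):
    """Expected points after rolling the given dice once.

    colours: string such as "rrr" or "ggy"; n: points so far; h: health.
    A roll with h or more shotguns kills the player and scores zero.
    """
    total = Fraction(0)
    for faces in product(*(FACES[c] for c in colours)):
        if faces.count("S") < h:
            total += n + faces.count("B")
    return total / 6 ** len(colours)

# Three red dice at full health with no points yet.
print(expected_points("rrr", 0, 3))
```

<p>Evaluating the function at <code>h = 1, 2, 3</code> for any colour combination gives the three values that each quadratic in the list has to pass through.</p>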
<p></p><center><br />
<a href="https://www.zombeewatch.org/"><img src="/sites/default/files/post_images/2013-06-08_zombees.jpg" alt="A zombie bee card from the game Munchkin." /></a><br />
</center><br />
As an example I will repeat one of the calculations here. How about <img class="teximage" src="/sites/default/files/tex/13cbcc327b9f8b0af666d348c6686bb7d364b628.png" alt="$ E_{r, r, r} (n, h) $" />, the one with three red dice. Whenever you begin a roll with <img class="teximage" src="/sites/default/files/tex/fe695a9389c9b0c07751fac97facfc61a7830b93.png" alt="$ n $" /> points, there are five possibilities for the number of points you will have at the end of the roll: <img class="teximage" src="/sites/default/files/tex/605cd3669e47ee5258124d525008357fb243480a.png" alt="$ 0 $" />, <img class="teximage" src="/sites/default/files/tex/fe695a9389c9b0c07751fac97facfc61a7830b93.png" alt="$ n $" />, <img class="teximage" src="/sites/default/files/tex/1f82ce0985222e9d6f845e01e3630ad211b58140.png" alt="$ n+1 $" />, <img class="teximage" src="/sites/default/files/tex/0af88c61c56f2cba7a0abce376cf26f4a523fb96.png" alt="$ n+2 $" /> or <img class="teximage" src="/sites/default/files/tex/e56b6de6b253a50ca02d8afb76fa51aca5fb4146.png" alt="$ n+3 $" />. One definite thing we can say is that <img class="teximage" src="/sites/default/files/tex/da28d68111ec37b6517b3ea98040421e88e23350.png" alt="$ P(n+3|h) = \frac{1}{216} $" /> because in order to get three points, we need to roll three brains in a row. Every other probability depends on what the health is. First, let's solve for them when the player has three health.
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/923b92ac9489d91c08d2bcb7f5869ee9c55ea0e3.png" alt="\[<br /> P(0|h=3) = \left ( \frac{1}{2} \right )^3 = \frac{1}{8}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>because in order to die and lose all your points, you would have to roll three shotgun blasts in a row. In order to gain or lose no points, you have to roll three non-brains without dying.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/6806767067551a9dae43539e551605e75a4d2218.png" alt="\[<br /> P(n|h=3) = \left ( \frac{5}{6} \right )^3 - \frac{1}{8} = \frac{49}{108}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>The first term is the probability of rolling three non-brains in a row. This also counts the possibility of three shotguns, however, which is why we subtract one eighth.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/58fd6c2c5340cc2a7c639bad455c509a8b69f059.png" alt="\[<br /> P(n+1|h=3) = 3 \left ( \frac{1}{6} \right ) \left ( \frac{5}{6} \right )^2 = \frac{25}{72}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>If we want one additional point, we are rolling one brain and two non-brains. We multiply by three because any one of the three red dice could've been the one that was a brain.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/8c595b2cfe8117fcd26921c9dce873aefef0f3e4.png" alt="\[<br /> P(n+2|h=3) = 3 \left ( \frac{1}{6} \right )^2 \left ( \frac{5}{6} \right ) = \frac{5}{72}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>For two additional points, we do the same thing except it's two brains and one non-brain. Plugging these into the formula for expected value,</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/d67c548727f76e17de180d6a76c951eedf345de8.png" alt="\begin{align*}<br /> E_{r, r, r} (n, 3) &= n \frac{49}{108} + (n+1) \frac{25}{72} + (n+2) \frac{5}{72} + (n+3) \frac{1}{216} \\<br /> &= \frac{7}{8}n + \frac{1}{2}<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Moving on to two health,</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/ba63c9a1f9eafea16a0a68d7b465e91644d36083.png" alt="\[<br /> P(0|h=2) = \left ( \frac{1}{2} \right )^3 + 3 \left ( \frac{1}{2} \right )^3 = \frac{1}{2}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>we get an extra term in the probability of dying compared to three health. We could roll the three shotgun outcome from our last calculation but we also need to count the three ways of rolling two shotguns and one non-shotgun.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/80dd640e943a48906d88f2059ca67cd295e9fb00.png" alt="\[<br /> P(n|h=2) = \left ( \frac{1}{3} \right )^3 + 3 \left ( \frac{1}{3} \right )^2 \left ( \frac{1}{2} \right ) = \frac{11}{54}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Here, the first term is the probability of three walkers and the second is the probability of two walkers and a shotgun.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/f284b82d2618807b9755c5d12fd7c6283a5310a0.png" alt="\[<br /> P(n+1|h=2) = 3 \left ( \frac{1}{6} \right ) \left ( \frac{1}{3} \right )^2 + 6 \left ( \frac{1}{2} \right ) \left ( \frac{1}{3} \right ) \left ( \frac{1}{6} \right ) = \frac{2}{9}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>This is brain-walker-walker plus brain-walker-shotgun. The six is there because there are six permutations we can apply to the three different values.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/ca37bb5e5bc992cb21d7010fca6f11f38f223016.png" alt="\[<br /> P(n+2|h=2) = 3 \left ( \frac{1}{6} \right )^2 \left ( \frac{1}{3} \right ) + 3 \left ( \frac{1}{6} \right )^2 \left ( \frac{1}{2} \right ) = \frac{5}{72}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>This one is brain-brain-walker plus brain-brain-shotgun. Now we can plug these in again to get</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/485da9e2f51ca381b62264fd0c0f96cdab5db6da.png" alt="\begin{align*}<br /> E_{r, r, r} (n, 2) &= n \frac{11}{54} + (n+1) \frac{2}{9} + (n+2) \frac{5}{72} + (n+3) \frac{1}{216} \\<br /> &= \frac{1}{2}n + \frac{3}{8}<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Notice that this roll puts you at a disadvantage unless you have zero points and nothing to lose. This is in contrast to the same roll at three health, where the roll breaks even at four points and only becomes a losing proposition beyond that. Now to deal with the case of one health,</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/c23b43df982b324ba19c5dfe324fe728c4e54b83.png" alt="\[<br /> P(0|h=1) = 1-\left ( \frac{1}{2} \right )^3 = \frac{7}{8}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Everything but three non-shotguns leads to death so we simply subtract this probability from unity.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/2d37ccc6ff45b7f4cdbc482088cebeca65cbd58f.png" alt="\[<br /> P(n|h=1) = \left ( \frac{1}{3} \right )^3 = \frac{1}{27}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>For this one, a single shotgun would kill us and a single brain would give us a point, so it's walker-walker-walker.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/30dbfdd3f457702b1edd73954934634369ece9ff.png" alt="\[<br /> P(n+1|h=1) = 3\left ( \frac{1}{6} \right ) \left ( \frac{1}{3} \right )^2 = \frac{1}{18}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>This has to be brain-walker-walker for the same reason.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/4b859e5119e6101e49f8ac2ec1910bb4f7f0fe2c.png" alt="\[<br /> P(n+2|h=1) = 3\left ( \frac{1}{6} \right )^2 \left ( \frac{1}{3} \right ) = \frac{1}{36}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Similarly, this is brain-brain-walker. Now to plug in these numbers one more time,</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/b1275cc180cac275df38cfcf18f815be3eab357d.png" alt="\begin{align*}<br /> E_{r, r, r} (n, 1) &= n \frac{1}{27} + (n+1) \frac{1}{18} + (n+2) \frac{1}{36} + (n+3) \frac{1}{216} \\<br /> &= \frac{1}{8}n + \frac{1}{8}<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Now we have three equations but we want to combine them into one that looks like <img class="teximage" src="/sites/default/files/tex/8448337933cbbb566e7d9e40f939c0f094675711.png" alt="$ E_{r, r, r} (n, h) = f(h)n + g(h) $" />. This is just a cheap trick so we will choose quadratics <img class="teximage" src="/sites/default/files/tex/f0ac7e760b0cdc40185ea3fb074f06497bf3495b.png" alt="$ f(h) = ah^2 + bh + c $" />, <img class="teximage" src="/sites/default/files/tex/974cbba4bd8f4e98e8cbb6afc9f7c93cd87ec883.png" alt="$ g(h) = xh^2 + yh + z $" />. The requirements <img class="teximage" src="/sites/default/files/tex/d6612cb86b057c149ae3b2d84207cd9c2fffb378.png" alt="$ f(3) = \frac{7}{8} $" />, <img class="teximage" src="/sites/default/files/tex/cb201a9773af5914ddfb08b12c5ed3aabde8a322.png" alt="$ f(2) = \frac{1}{2} $" />, <img class="teximage" src="/sites/default/files/tex/949c919ab900ed1afe8e3eaa2a5cf81b300bd5c4.png" alt="$ f(1) = \frac{1}{8} $" />, <img class="teximage" src="/sites/default/files/tex/2a0ad206e9fecd17464355b09304ee5f46e8947c.png" alt="$ g(3) = \frac{1}{2} $" />, <img class="teximage" src="/sites/default/files/tex/8a10da2fc0edbfb7a121138fbbda6d9a58a5d85d.png" alt="$ g(2) = \frac{3}{8} $" /> and <img class="teximage" src="/sites/default/files/tex/2e72a1f2ec33722db848cbf6d00c4ca06ec049d2.png" alt="$ g(1) = \frac{1}{8} $" /> reduce this to a system of linear equations. Solving them, we get <img class="teximage" src="/sites/default/files/tex/ad9d6433e4e066f1a36890daf2ecc08dbebe8c00.png" alt="$ (a, b, c) = (0, \frac{3}{8}, -\frac{1}{4}) $" /> and <img class="teximage" src="/sites/default/files/tex/57986310cf3e509b311ad931cfafa500989f6306.png" alt="$ (x, y, z) = (-\frac{1}{16}, \frac{7}{16}, -\frac{1}{4}) $" />. 
Of the ten tedious functions of <img class="teximage" src="/sites/default/files/tex/fe695a9389c9b0c07751fac97facfc61a7830b93.png" alt="$ n $" /> and <img class="teximage" src="/sites/default/files/tex/92bbc856f0d886eb1d6691336a4cffa24996aec6.png" alt="$ h $" />, this gives us the seventh one. The others are all obtained in similar ways.</p>
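<p>The interpolation step can be checked mechanically. This is a small sketch, with a hypothetical helper name, that fits a quadratic through the values at one, two and three health using exact fractions.</p>

```python
from fractions import Fraction

def quadratic_through(y1, y2, y3):
    """Coefficients (a, b, c) of a*h^2 + b*h + c through (1, y1), (2, y2), (3, y3)."""
    a = (y3 - 2 * y2 + y1) / 2   # half of the second difference
    b = (y2 - y1) - 3 * a        # first difference minus a*(2^2 - 1^2)
    c = y1 - a - b               # match the value at h = 1
    return a, b, c

# Values of f and g for three red dice, taken from the worked example.
f = quadratic_through(Fraction(1, 8), Fraction(1, 2), Fraction(7, 8))
g = quadratic_through(Fraction(1, 8), Fraction(3, 8), Fraction(1, 2))
print(f, g)
```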
<p>Clearly by taking your first roll, you can only gain points or stay the same. Subsequent rolls only happen if you stand to benefit from them. Therefore a turn will give a player more than zero points on average. In other words, the game will <a href="http://en.wikipedia.org/wiki/Almost_surely">almost surely</a> end. A harder calculation would tell you how many turns this is likely to take. This is the same as asking what the expectation is for the number of points that can be accumulated in a single turn if this optimal strategy is used from start to finish. In fact one could get the entire distribution for the number of points accumulated. This is mainly hard because of the walkers. The events of rolling a certain outcome and choosing a certain set of dice for the next roll are not independent. Maybe we should simulate this. Sorry for switching between first, second and third person so often.</p>
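<p>As a starting point for such a simulation, here is a rough sketch of a single turn played with the rule described in this post: roll while the expected number of points after one more roll exceeds the current total. Several things are assumptions on my part: the turn ends when the cup cannot supply enough dice, the player stops when the expectation merely equals the current total, and all function names are hypothetical.</p>

```python
import random
from fractions import Fraction
from functools import lru_cache
from itertools import combinations, product

FACES = {"g": "BBBWWS", "y": "BBWWSS", "r": "BWWSSS"}  # B brain, W walker, S shotgun

@lru_cache(maxsize=None)
def one_roll_value(colours, n, h):
    """Expected points if these three dice are rolled once and the player stops."""
    total = 0
    for faces in product(*(FACES[c] for c in colours)):
        if faces.count("S") < h:
            total += n + faces.count("B")
    return Fraction(total, 6 ** len(colours))

def roll_value(hand, cup, n, h):
    """Average of one_roll_value over every equally likely draw from the cup."""
    draws = list(combinations(cup, 3 - len(hand)))
    values = [one_roll_value(tuple(sorted(hand + list(d))), n, h) for d in draws]
    return sum(values) / len(values)

def play_turn(rng):
    """Play one turn with the one-step rule and return the points scored."""
    cup = list("g" * 6 + "y" * 4 + "r" * 3)
    hand, n, h = [], 0, 3            # walkers kept, points, health
    while len(cup) >= 3 - len(hand) and roll_value(hand, cup, n, h) > n:
        rng.shuffle(cup)
        need = 3 - len(hand)
        dice, cup = hand + cup[:need], cup[need:]
        faces = [rng.choice(FACES[c]) for c in dice]
        h -= faces.count("S")
        if h <= 0:
            return 0                 # shot three times: the turn scores nothing
        n += faces.count("B")
        hand = [c for c, f in zip(dice, faces) if f == "W"]
    return n

rng = random.Random(0)
scores = [play_turn(rng) for _ in range(2000)]
print("average points per turn:", sum(scores) / len(scores))
```

<p>Collecting the scores in a histogram would give an estimate of the full distribution, and dividing 13 by the average gives a crude feel for how many turns a game might take.</p>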
</div></div></div><div class="field field-name-taxonomy-vocabulary-2 field-type-taxonomy-term-reference field-label-above"><div class="field-label">Tags: </div><div class="field-items"><div class="field-item even"><a href="/taxonomy/term/15">fun calculations</a></div><div class="field-item odd"><a href="/taxonomy/term/21">tuning performance</a></div></div></div>
Sun, 09 Jun 2013 06:05:37 +0000
root
Shitty Airlines
https://smallperturbation.com/arguing-with-ads
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>It's the holiday season. And that means I get to be reminded of <a href="http://www.youtube.com/watch?v=DagVklB4VHQ">how logical</a> air travel is. One of the most annoying things is that I can't take everything as carry-on luggage. Don't get me wrong, I hardly bring anything. But two things I usually bring are a razor and shaving cream which security guards tend to take away. Because of that, the idea of checking in online saves no time at all. Here's what the screen should really say:</p>
<p><img src="/sites/default/files/post_images/2012-12-20_flight_flowchart.png" width="600" alt="A chart showing that you must be a female or a male who doesn't shave in order for the online checkin process to serve a useful purpose." /></p>
<p>Nevertheless, I got to the airport on time and was able to stay relatively occupied on the plane.</p>
<!--break--><p>Since I'm on the topic of airports, I decided to look into the bold claims made by HSBC ads. In case you haven't taken a plane in a long time (or were fortunate enough to board the plane by actually going outside), the gantries connecting planes to most major airports are <a href="http://www.foxtranslate.com/culture/hsbc-airport-ads-share-remarkable-insight-to-our-world">full of banners</a> that reveal an amazing "fact" about global markets. The idea is to show that HSBC is innovative because it sees great potential in the world. However, some of them are false and some are misleading.</p>
<p><b>The Halal industry is worth $3 trillion worldwide:</b> Halal is only considered a specialty food in Western nations. I took this to mean that $3 trillion worth of meat products are sold annually in the countries where a majority of the citizens are Muslim. But data from where I live convinced me that this is misleading. In 2009, <a href="http://www.cmc-cvc.com/english/industry_statistic_e.asp">$21.3 billion</a> came from meat sold in Canada. This means that the average meat-eater in Canada spent $655 on meat that year. In the United States, total meat revenue was <a href="http://www.meatami.com/ht/d/sp/i/47465/pid/47465/">$154.8 billion</a> that same year. This means that the average American carnivore spent $539. Vegetarian rates assumed were 4% and 6% respectively. We can now extrapolate these numbers and see how many Muslims would be needed to value the Halal industry at $3 trillion. Unless meat is <i>many times more expensive</i> in the rest of the world, we would need there to be between 4.6 and 5.6 billion Halal buyers out there. This number seems too high because less than 2 billion people follow Islam. It still seems too high when you consider that some countries like Indonesia probably have Halal food as the norm in every grocery store and sell it to people whether they follow Islam or not. <a href="http://en.islamtoday.net/artshow-233-4501.htm">An article about this</a> from South Africa clears it up for me. They estimate the value of Halal food at $160 billion per year worldwide. The estimate goes up to $2 trillion when they include "finance, pharmaceuticals, cosmetics, logistics and fashion" - things that I didn't even know had Halal versions.</p>
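<p>The extrapolation in the paragraph above is just a division, so here is a minimal sketch of it. The per-person spending figures and the $3 trillion claim are the ones quoted above; the function name <code>buyers_needed</code> is made up for illustration.</p>

```python
# Annual spending per meat-eater (2009), as quoted in the paragraph above.
SPEND_PER_MEAT_EATER = {"Canada": 655.0, "United States": 539.0}

# The figure from the HSBC ad, in dollars.
HALAL_CLAIM = 3e12


def buyers_needed(total_value, spend_per_person):
    """How many buyers are needed to reach total_value at this spending rate."""
    return total_value / spend_per_person


for country, spend in SPEND_PER_MEAT_EATER.items():
    billions = buyers_needed(HALAL_CLAIM, spend) / 1e9
    print(f"At {country}'s rate: {billions:.1f} billion Halal buyers")
```

<p>The two results bracket the range of buyers given in the text, assuming meat costs roughly the same everywhere.</p>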
<p><b>Pakistan is the world's second largest exporter of clothing:</b> I don't dispute this fact but does it really show a bright future for Pakistan? It seems like this ad is trying to take a major problem and sweep it under the sweatshop produced rug. Textile makers in Pakistan earn <a href="http://www.werner-newtwist.com/en/site-vol-001/newtwist.htm">35 cents per hour</a> which makes them some of the poorest workers in the world. They aren't treated very well when they are doing the job either. Another country whose wage is in the bottom five is Bangladesh. A recent fire in one of their clothing factories <a href="http://www.nytimes.com/2012/12/18/world/asia/bangladesh-factory-fire-caused-by-gross-negligence.html">killed 112 people</a> because the company had enough disrespect to lock them inside for the duration of the shift. I watched an episode about this on <a href="http://en.wikipedia.org/wiki/The_Lang_and_O%27Leary_Exchange">The Lang and O'Leary Exchange</a>. Amanda Lang was saying that regulations should be in place to give these factory workers higher wages. Kevin O'Leary then called her a bleeding heart and said that this would accomplish nothing other than sending them all back to unemployment when corporations decide to get cheap labour elsewhere. I guess the only solution would be a concerted effort to have all of these countries raise their minimum wages at the same time.</p>
<p><b>At least $1.6 billion is lying down the back of US sofas:</b> I laughed and said "yeah, I bet there is a lot" when I walked by this one. But is there really so much that every American has lost $5 this way? Another British bank, Halifax, <a href="http://www.dailymail.co.uk/news/article-2248022/Sitting-small-fortune-Guitarist-finds-85-sofa-looking-plectrum.html">has done a survey about this</a> and estimated that the couch fortune is £42 million in Britain. Since the US has five times the population of Britain, I would expect it to have $342 million, not $1.6 billion. So until HSBC makes some effort to back up this number, I will call bullshit.</p>
<p><b>On average, Russian billionaires are 19 years younger than those in America:</b> I'm sure you could verify this by reading <i>Forbes</i> magazine. However, I suspected that it might have nothing to do with billionaires. Perhaps Russian <i>people</i> are 19 years younger than those in America. My favourite author <a href="http://sfwriter.com/">Robert J Sawyer</a> has written:</p>
<blockquote><p>
They were the first in space, they led the world in so much! And their literature, their music! But now it's a land of pestilence and poverty, of disease and early death - you would not want to visit it, trust me.
</p></blockquote>
<p>To test this, I looked at census data for <a href="http://www.census.gov/prod/cen2010/briefs/c2010br-03.pdf">both</a> <a href="http://www.gks.ru/doc_2009/bul_dr/chisl-pv09.zip">countries</a> and determined that the average ages differ by only six years: 32 for Russia and 38 for America. Therefore I was wrong and this fact is significant. Note that median ages for a country are widely quoted. I went with the mean this time because I figured that's what the original statement used.</p>
<p><b>The US has more Spanish language newspaper readers than Latin America:</b> This is just another way of saying that Latin America is poor compared to the US, right? That's what I thought at first because <a href="http://stateofthemedia.org/2011/newspapers-essay/data-page-6/">newspaper readership increases with income</a>. However, I was wrong again because this alone cannot explain it. The population of Spanish speakers in the US is about 50 million, whereas the population of Latin America minus Brazil is about 400 million. Therefore the difference in wealth would need to be a factor of 8 (actually it would have to be more because the readership-income curve is sublinear) to account for this change. If you look at the <a href="http://en.wikipedia.org/wiki/List_of_Latin_American_countries_by_GDP_(PPP)">GDP per capita</a>, it seems like you can only get a factor of 4 this way.</p>
<p><b>The amount of gold beneath the ocean could give everyone on Earth €100,000:</b> Give me a break. Bankers are smart enough to know that this would not make anyone richer. You could print money to give everybody €100,000. Or wait fifty years and the average salary in Canada will increase by this amount simply due to inflation. The amount of gold <a href="http://en.wikipedia.org/wiki/Gold_reserves">sitting in reserves</a> is about $2 trillion - enough to give everyone €200. So if you were to destroy many marine ecosystems in order to get at 500 times as much gold, I think this would just make gold 500 times less valuable.</p>
<p><b>Two thirds of the people who have ever reached 65 are alive today:</b> The anti-religious person in me gets concerned when I see <a href="http://www.forbes.com/sites/erikaandersen/2011/01/30/no-more-gold-watches-baby-boomers-and-retirement/">A</a> <a href="https://www.dominioninsurance.com/download.php/2216895/549158">ton</a> <a href="http://thebeaveronline.co.uk/2011/11/02/measured-musings-the-world-reaches-7-billion/">of</a> <a href="http://www.simonejoyaux.com/2012/07/an-interesting-tidbit/">Internet</a> <a href="http://www.urbanministry.org/jeremydelrio-two-thirds-people-who-ever-reached-65-alive-today-hsbc-ad-reason-why-social-security-ag">users</a> repeating this "interesting fact" and using it to reiterate their points without questioning it. Luckily, I found <a href="http://poetryofphysics.blogspot.ca/2011/12/two-thirds-of-people-who-have-ever.html">one blogger</a> who realizes that this HSBC ad is <i>not</i> a reliable source. In some countries, there is enough data to debunk this myth. Basic data from the Canadian census <a href="http://www12.statcan.gc.ca/census-recensement/2011/dp-pd/pyramid-pyramide/his/index-eng.cfm">goes back to 1921</a> on their website. Here is a chart of how many 65+ year-olds there were at 30 year intervals:</p>
<table border="2">
<tr>
<td>1921</td>
<td>420,244</td>
</tr>
<tr>
<td>1951</td>
<td>1,086,273</td>
</tr>
<tr>
<td>1981</td>
<td>2,360,975</td>
</tr>
<tr>
<td>2011</td>
<td>4,973,438</td>
</tr>
</table>
<p>Living to 95 is quite rare so it's safe to assume that the numbers above represent different batches of 65-year-olds. From this, we see that the 4,973,438 seniors alive last year were about 56% of the seniors who have lived in Canada since 1921. This is less than two thirds even if we believe (<a href="http://en.wikipedia.org/wiki/William Lyon Mackenzie">the</a> <a href="http://en.wikipedia.org/wiki/Laura Secord">rather</a> <a href="http://en.wikipedia.org/wiki/Jean Talon">false</a> <a href="http://en.wikipedia.org/wiki/Jacques Cartier">statement</a>) that no Canadians lived to be 65 before 1921. If the rest of the world is anything like Canada, a minority of the seniors who have ever lived are alive today. People have <a href="http://apps.business.ualberta.ca/rfield/lifeexpectancy.htm">an exaggerated picture</a> of how short human lifespans used to be.</p>
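<p>For anyone who wants to redo the arithmetic, here is a small sketch using the four census counts from the table. It makes the same simplifying assumption as the text, namely that the counts taken 30 years apart are separate batches of seniors; the names <code>SENIORS_65_PLUS</code> and <code>share_alive</code> are only for illustration.</p>

```python
# Number of Canadians aged 65+ in each census year, copied from the table.
SENIORS_65_PLUS = {
    1921: 420_244,
    1951: 1_086_273,
    1981: 2_360_975,
    2011: 4_973_438,
}


def share_alive(counts, latest_year):
    """Fraction of all counted seniors who belong to the latest batch."""
    return counts[latest_year] / sum(counts.values())


share = share_alive(SENIORS_65_PLUS, 2011)
print(f"Share alive in 2011: {share:.1%}")
print(f"Two thirds would be: {2 / 3:.1%}")
```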
</div></div></div><div class="field field-name-taxonomy-vocabulary-2 field-type-taxonomy-term-reference field-label-above"><div class="field-label">Tags: </div><div class="field-items"><div class="field-item even"><a href="/taxonomy/term/10">nerd humour</a></div><div class="field-item odd"><a href="/taxonomy/term/15">fun calculations</a></div><div class="field-item even"><a href="/taxonomy/term/22">pseudoscience</a></div></div></div>Fri, 21 Dec 2012 00:15:09 +0000root35 at https://smallperturbation.comCalculus On The Surface Of A Box
https://smallperturbation.com/ants-doing-calculus
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Lines drawn on a curved surface can be tricky. Consider one of the most basic facts about Euclidean geometry: that parallel lines never intersect. This does not hold on the surface of the Earth. If you and a friend stand one metre apart in some location on the Earth, you could both try drawing a line and heading due north. The lines would appear to be parallel during all stages of the journey but you would eventually find that they intersect at the North Pole. An observer watching from space would say that the lines don't look parallel but any measurement you could make without leaving the Earth would tell you that they are. This is because a sphere is <i>locally flat</i>. The fact that the lines cross can therefore be used to <i>prove</i> that the world is round. Similarly, measurements done in three-dimensional space might be able to prove things like that about <i>the universe</i>.</p>
<p><img src="/sites/default/files/post_images/2012-08-01_parallel_lines.png" alt="The railroad tracks look parallel whether they are on a plane or a sphere." /></p>
<p>Now it's not just parallel lines that we have to worry about. If space were not flat, circle areas would appear to differ from <img class="teximage" src="/sites/default/files/tex/8228a7e91014dda1124b2f334affced0a52147bf.png" alt="$ \frac{1}{2} \tau r^2 $" /> and the sum of the angles in a triangle would appear to differ from <img class="teximage" src="/sites/default/files/tex/05a6306d668ae86faa10205c9707165fb201b1e4.png" alt="$ \frac{1}{2} \tau $" />. There is a wonderful formalism for doing calculus with lines drawn on a curved surface and there are two contexts in which people normally learn it. One is <a href="http://en.wikipedia.org/wiki/Great_circle">navigation</a> and the other is <a href="http://en.wikipedia.org/wiki/General_relativity">general relativity</a>. I want to take a shot at explaining it in the context of <a href="http://www.smallperturbation.com/two-ants">the two ant problem</a>. This requires us to find the shortest distance between two points on a curved surface <i>and prove it is the shortest</i>. So if I hadn't already solved the problem, it might seem appropriate to use the techniques that were historically used to prove that a straight line is the shortest path between two points in flat space and that a great circle is the shortest path between two points on a sphere.</p>
<!--break--><p>In situations like this, we should begin by choosing a co-ordinate system for our box. We will use polar co-ordinates. Those describe locations in terms of two angles <img class="teximage" src="/sites/default/files/tex/edc27ee97f4056d6bc3e6b823ea6193f8ef99a76.png" alt="$ (\theta, \phi) $" /> where <img class="teximage" src="/sites/default/files/tex/40cb2eaecf0fdf938989a51045ef5b89bc0602ae.png" alt="$ 0 \leq \theta < \frac{1}{2}\tau $" /> and <img class="teximage" src="/sites/default/files/tex/d7d04c28dcfaf9e555b9c17f27525a9dc96058b6.png" alt="$ 0 \leq \phi < \tau $" />. If we wanted to describe interior points in the whole rectangular prism, the co-ordinates would be <img class="teximage" src="/sites/default/files/tex/3d9eb85e1d4af3e84068e96dc29865ccf3d03324.png" alt="$ (r, \theta, \phi) $" />. This is just like how <img class="teximage" src="/sites/default/files/tex/edc27ee97f4056d6bc3e6b823ea6193f8ef99a76.png" alt="$ (\theta, \phi) $" /> parametrize a sphere while <img class="teximage" src="/sites/default/files/tex/3d9eb85e1d4af3e84068e96dc29865ccf3d03324.png" alt="$ (r, \theta, \phi) $" /> parametrize a ball. A sphere and a box are both manifolds and they yield similar descriptions because they are <i>homeomorphic</i> manifolds. You may have heard that a torus is a manifold that is not homeomorphic to a sphere. One way to see why is to draw a line of constant <img class="teximage" src="/sites/default/files/tex/f2e88e678a46477e058a65b999c23eaf8a60074d.png" alt="$ \theta $" /> and <img class="teximage" src="/sites/default/files/tex/9c4c905cdc3f98ee64c2a7affff5f991ac6f18f5.png" alt="$ \phi $" />. This hits two points on the torus.</p>
<h2>Angles should label a unique point</h2>
<table>
<tr>
<td>
<img src="/sites/default/files/post_images/2012-08-01_polar_coords.png" alt="A box and a sphere are well described by the polar co-ordinate system, a torus is not." />
</td>
<td>
A curve is a continuously indexed set of points so at time <img class="teximage" src="/sites/default/files/tex/b02586d1abd601a350ffa37d1b25b93f24016327.png" alt="$ t $" />, we can refer to our position on the path we end up taking by <img class="teximage" src="/sites/default/files/tex/06188ab2180e2f0b691df8e896b7ce10d10c8198.png" alt="$ (\theta(t), \phi(t)) $" />. To find the length of a curve, we can look at tiny tangents to the curve at a bunch of different times and add up all of their lengths. Therefore we want to minimize
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/589d1f79f42975689f131d8dec8f062908f718f7.png" alt="\[<br /> S = \int_{t_1}^{t_2} \left < \left ( \frac{\textup{d}\theta}{\textup{d}t}, \frac{\textup{d}\phi}{\textup{d}t} \right ), \left ( \frac{\textup{d}\theta}{\textup{d}t}, \frac{\textup{d}\phi}{\textup{d}t} \right ) \right > \textup{d}t<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
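<p>where <img class="teximage" src="/sites/default/files/tex/1d5ec095ca049600bf63571cce3122e2f13cd1db.png" alt="$ t_1 $" /> and <img class="teximage" src="/sites/default/files/tex/0106e808d880e824f6464e7d075d724c4e705358.png" alt="$ t_2 $" /> are starting and ending times that we pick. Something that helps us talk about the length of a vector is an <a href="http://en.wikipedia.org/wiki/Inner_product">inner product</a> and that's what the angular brackets above represent. Strictly speaking, the inner product of a tangent with itself is its <i>squared</i> length, but a path that minimizes the integral above also minimizes the integral of the length itself (it is just traversed at constant speed), and the squared version is much easier to differentiate. We need to figure out what a natural inner product would be on a curved shape.</p>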
<p>The standard inner product in Euclidean space just tells us that squared length is the sum of the squares of the components. Expressing this as an <i>infinitesimal line element</i>, it looks like:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/773bc936b9acb76c8325d0811ec324998ff2a528.png" alt="\[<br /> \textup{d}s^2 = \textup{d}x^2 + \textup{d}y^2 + \textup{d}z^2<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>If we are on a two-dimensional curved surface like a sphere of radius <img class="teximage" src="/sites/default/files/tex/e8f5c92bd1ae357b639e168eb563dcffbe03ad8b.png" alt="$ r $" />, we don't label our points by <img class="teximage" src="/sites/default/files/tex/ce868e841ee2ba32ad612170f464a61be9a4e8d4.png" alt="$ (x, y, z) $" /> anymore, we label them by <img class="teximage" src="/sites/default/files/tex/edc27ee97f4056d6bc3e6b823ea6193f8ef99a76.png" alt="$ (\theta, \phi) $" />. Does this mean our line element is just <img class="teximage" src="/sites/default/files/tex/9774c69eeb1af966ee2d4a3624e41047c9d7dc8b.png" alt="$ \textup{d}s^2 = \textup{d}\theta^2 + \textup{d}\phi^2 $" />? No... this wouldn't make any sense. Lengths should depend on <img class="teximage" src="/sites/default/files/tex/e8f5c92bd1ae357b639e168eb563dcffbe03ad8b.png" alt="$ r $" /> in some way because a path going once around a large sphere is clearly longer than a path going once around a small sphere. We can get the actual line element if we plug the Cartesian representation for a sphere:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/fa07fc80e6c06ad592b920d8f8ca41a19eb63254.png" alt="\begin{align*}<br /> x &= r \sin \theta \cos \phi \\<br /> y &= r \sin \theta \sin \phi \\<br /> z &= r \cos \theta<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>into the Euclidean line element. What we get is:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/82aa6b18b6db854cdcb7054db8e9611879d151ea.png" alt="\[<br /> \textup{d}s^2 = r^2 \textup{d}\theta^2 + r^2 \sin^2 \theta \textup{d}\phi^2<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
</td>
</tr>
</table>
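<p>If you don't feel like doing the substitution by hand, the sphere's line element can be checked numerically. The sketch below is only a sanity check, not a derivation: it differentiates the Cartesian representation with central differences and forms the inner products of the resulting tangent vectors. The helper names and the sample values of <code>r</code>, <code>theta</code> and <code>phi</code> are arbitrary choices.</p>

```python
import math


def sphere_point(r, theta, phi):
    """Cartesian representation of a sphere of radius r."""
    return (
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    )


def induced_metric(embed, theta, phi, h=1e-6):
    """Return (g_theta_theta, g_theta_phi, g_phi_phi) for an embedding.

    The tangent vectors d(x,y,z)/d(theta) and d(x,y,z)/d(phi) are estimated
    with central differences, then dotted with each other.
    """
    def tangent(d_theta, d_phi):
        plus = embed(theta + d_theta, phi + d_phi)
        minus = embed(theta - d_theta, phi - d_phi)
        return [(a - b) / (2 * h) for a, b in zip(plus, minus)]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    e_theta = tangent(h, 0.0)
    e_phi = tangent(0.0, h)
    return dot(e_theta, e_theta), dot(e_theta, e_phi), dot(e_phi, e_phi)


r, theta, phi = 2.0, 0.7, 1.3
g_tt, g_tp, g_pp = induced_metric(lambda t, p: sphere_point(r, t, p), theta, phi)
print("g_theta_theta:", g_tt, "expected", r**2)
print("g_theta_phi:  ", g_tp, "expected", 0.0)
print("g_phi_phi:    ", g_pp, "expected", (r * math.sin(theta)) ** 2)
```

<p>The same <code>induced_metric</code> helper works for any other embedding you pass in, which is handy when the algebra gets messy.</p>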
<p>Why are we interested in the <i>infinitesimal</i> version of the line element? In the Euclidean case, we could have simply written <img class="teximage" src="/sites/default/files/tex/318db06b4f5324633cdc211ff8d6cc5c53d9cf14.png" alt="$ s^2 = x^2 + y^2 + z^2 $" /> without the differentials. But for a sphere, the coefficients appearing beside the squared differentials depend on the co-ordinate values. If we try to write down the length of a path between two points that are separated by a finite distance, the co-ordinate values would change as we moved along this path. So writing down an infinitesimal line element and integrating is what makes the most sense. Let's see if we can write down the equivalent object for a box.</p>
<p>In two dimensional polar co-ordinates the equation of the vertical line <img class="teximage" src="/sites/default/files/tex/bea14078c63b4175fad84678dbfd9c88d87dd844.png" alt="$ x = \pm 1 $" /> is <img class="teximage" src="/sites/default/files/tex/bf70d80330f9b65dd1600420aee31b3b3a6b7eea.png" alt="$ r = \pm \frac{1}{\cos \phi} $" />. Similarly the horizontal line <img class="teximage" src="/sites/default/files/tex/f090d96284255b58a9afd0718b90af290fcc9605.png" alt="$ y = \pm 1 $" /> is <img class="teximage" src="/sites/default/files/tex/fdcf553d14bc9a269d02a902576046fe1f9b162f.png" alt="$ r = \pm \frac{1}{\sin \phi} $" />. Using this logic, an <img class="teximage" src="/sites/default/files/tex/377389c6b62abbd10fe5c62abf63c2d870d6c213.png" alt="$ A \times B \times C $" /> box can be described by:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/5ca1fa4e6048eb8105cdd193788c5ecb14a34cc6.png" alt="\begin{align*}<br /> x &= A r(\theta) r(\phi) \cos \theta \cos \phi \\<br /> y &= B r(\theta) r(\phi) \cos \theta \sin \phi \\<br /> z &= C r(\theta) \sin \theta<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>where <img class="teximage" src="/sites/default/files/tex/e8f5c92bd1ae357b639e168eb563dcffbe03ad8b.png" alt="$ r $" /> is the piecewise function:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/2738df01358dd683da22e1612c06388f64d5f7b6.png" alt="\[<br /> r(\theta) =<br /> \begin{cases}<br /> \frac{1}{\cos \theta} & -\frac{\tau}{8} \leq \theta < \frac{\tau}{8} \\<br /> \frac{1}{\sin \theta} & \frac{\tau}{8} \leq \theta < \frac{3\tau}{8} \\<br /> \frac{-1}{\cos \theta} & \frac{3\tau}{8} \leq \theta < \frac{5\tau}{8} \\<br /> \frac{-1}{\sin \theta} & \frac{5\tau}{8} \leq \theta < \frac{7\tau}{8}<br /> \end{cases}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
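<p>Here is a quick sketch for checking that this parametrization really lands on a box. It implements the piecewise function and the three formulas above, then confirms that every sampled point has at least one co-ordinate sitting on a face. Extending <code>r</code> periodically outside the listed intervals is my assumption, and so is the helper name <code>surface_defect</code>. Note that with the formulas as written the faces sit at x = ±A, y = ±B and z = ±C, so the three constants act as half-widths.</p>

```python
import math

TAU = 2 * math.pi


def r(angle):
    """Piecewise radial factor from the text, extended periodically."""
    a = (angle + TAU / 8) % TAU - TAU / 8  # reduce to [-tau/8, 7*tau/8)
    if a < TAU / 8:
        return 1 / math.cos(a)
    if a < 3 * TAU / 8:
        return 1 / math.sin(a)
    if a < 5 * TAU / 8:
        return -1 / math.cos(a)
    return -1 / math.sin(a)


def box_point(theta, phi, A, B, C):
    """Point on the box for the angles (theta, phi)."""
    rt, rp = r(theta), r(phi)
    x = A * rt * rp * math.cos(theta) * math.cos(phi)
    y = B * rt * rp * math.cos(theta) * math.sin(phi)
    z = C * rt * math.sin(theta)
    return x, y, z


def surface_defect(point, A, B, C):
    """Zero when the point lies on the surface of the box."""
    x, y, z = point
    return max(abs(x) / A, abs(y) / B, abs(z) / C) - 1.0


A, B, C = 12.0, 12.0, 30.0
worst = max(
    abs(surface_defect(box_point(0.37 * i - 0.7, 0.41 * j, A, B, C), A, B, C))
    for i in range(20)
    for j in range(20)
)
print("largest distance from the surface (scaled units):", worst)
```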
<p>We can plug this into the Euclidean line element and perform a bunch of simplifications. Since the two ant problem deals with a <img class="teximage" src="/sites/default/files/tex/3698f2d6f2fdf63c07694c5644a19b7066772d76.png" alt="$ 12 \times 12 \times 30 $" /> box, we can also specialize to <img class="teximage" src="/sites/default/files/tex/ff587fdd395f3e6221a6c209ec92cf1d0e987e58.png" alt="$ A = B $" />. After we do this, our line element for the box is still quite messy:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/5503a63523cd15e2ee8a20cf6393efc63651fb58.png" alt="\begin{align*}<br /> \textup{d}s^2 &= \left [ A^2 \left ( r^{\prime2}(\theta) r^2(\phi) \cos^2 \theta + r^2(\theta) r^2(\phi) \sin^2 \theta - 2 r^{\prime}(\theta) r(\theta) r^2(\phi) \cos \theta \sin \theta \right ) \right \none + \\<br /> &\left \none C^2 \left ( r^{\prime2}(\theta) \sin^2 \theta + r^2(\theta) \cos^2 \theta + 2r^{\prime}(\theta)r(\theta) \sin \theta \cos \theta \right ) \right ] \textup{d}\theta^2 + \\<br /> &A^2 \left ( r^2(\theta)r^{\prime2}(\phi) \cos^2 \theta + r^2(\theta)r^2(\phi) \cos^2 \theta \right ) \textup{d}\phi^2 + \\<br /> &2A^2 \left ( r^{\prime}(\theta) r(\theta) r^{\prime}(\phi)r(\phi) \cos^2 \theta - r^2(\theta)r^{\prime}(\phi)r(\phi) \cos \theta \sin \theta \right ) \textup{d}\theta \textup{d}\phi<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>This seems like a good way of getting the line element for a manifold if we know it is embedded in a higher dimensional Euclidean space. What about the inverse problem? Does every line element belong to a shape embedded in Euclidean space? The <a href="http://en.wikipedia.org/wiki/Nash_embedding_theorem">Nash embedding theorem</a> says yes, although the number of dimensions required can be larger than the one you would naively hope for. For example, the hyperbolic plane described by <img class="teximage" src="/sites/default/files/tex/a2cd1c770a6d8c8d3bb76fe023213f1e28221b54.png" alt="$ \textup{d}s^2 = \frac{1}{y^2} \left (\textup{d}x^2 + \textup{d}y^2 \right), \; y > 0 $" /> cannot be smoothly embedded in <img class="teximage" src="/sites/default/files/tex/6fe2b93f1dcdf23b4ddc2049997f8b3089cc99ba.png" alt="$ \mathbb{R}^3 $" /> in a way that preserves its distances. It can be embedded in <img class="teximage" src="/sites/default/files/tex/932433bc9b1d3c5db8e818bea55b80da2ebaa471.png" alt="$ \mathbb{R}^5 $" />. Whether or not four dimensions are enough is, at the time of writing, unknown.</p>
<p>How do we get an inner product out of this? A general inner product in finite dimensions is given by <img class="teximage" src="/sites/default/files/tex/364d9467b1584b00704692589a87df6e9ec2914a.png" alt="$ \left <u, v \right > = u^{\textup{T}} G v $" /> where <img class="teximage" src="/sites/default/files/tex/5a89a05dc5c76dddab67c6c0758ad36e3e3b8689.png" alt="$ G $" /> is a symmetric, positive-definite matrix. Now notice that <img class="teximage" src="/sites/default/files/tex/c61b5ac523f095980cb7a1f300ffe20689cfa0e3.png" alt="$ \textup{d}s^2 = g_{\theta \theta} \textup{d}\theta^2 + 2 g_{\theta \phi} \textup{d}\theta \textup{d}\phi + g_{\phi \phi} \textup{d}\phi^2 $" /> is precisely the expression:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/5a696c7a8c51447ea5775265e17d83f22a56c8f5.png" alt="\[<br /> \textup{d}s^2 = [\textup{d} \theta \; \; \textup{d}\phi] \left [ \begin{tabular}{cc} g_{\theta \theta} & g_{\theta \phi} \\ g_{\theta \phi} & g_{\phi \phi} \end{tabular} \right ] \left [ \begin{tabular}{c} \textup{d}\theta \\ \textup{d}\phi \end{tabular} \right ]<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>This means that from a line element, we can read off the components of a "matrix" <img class="teximage" src="/sites/default/files/tex/5a89a05dc5c76dddab67c6c0758ad36e3e3b8689.png" alt="$ G $" />. This bilinear map is called the <i>metric tensor</i> or the <i>Riemannian metric</i>. A manifold equipped with one of these is called a <i>Riemannian manifold</i>. The fact that distances on a surface are always positive forces the matrix to be positive-definite (all eigenvalues positive). As an aside, we can consider another case where one eigenvalue is negative and the rest of them are positive. This is what happens in relativity which is concerned with paths through <i>spacetime</i> rather than paths through space. The one negative eigenvalue corresponds to the one time direction. In this case, flat spacetime is <i>Minkowski space</i> rather than Euclidean space. And instead of a Riemannian manifold, we have a <i>Lorentzian manifold</i>. Lorentzian manifolds have given rise to a subject called <a href="http://en.wikipedia.org/wiki/Causal_structure">causality theory</a>.</p>
<p>Now that we know what our inner product is, we can go back to the integral <img class="teximage" src="/sites/default/files/tex/5e5629a35a0da8105df0085e04721871fabab195.png" alt="$ S $" /> which gives us the length of a path joining the two ants.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/509d79b69bd112033eb0b517c1dc501eb3a3fd96.png" alt="\begin{align*}<br /> S &= \int_{t_1}^{t_2} \mathcal{L} \; \textup{d}t \\<br /> \mathcal{L} &= \left [ \frac{\textup{d}\theta}{\textup{d}t} \;\; \frac{\textup{d}\phi}{\textup{d}t} \right ] G \left [ \frac{\textup{d}\theta}{\textup{d}t} \;\; \frac{\textup{d}\phi}{\textup{d}t} \right ]^{\textup{T}}<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p><img class="teximage" src="/sites/default/files/tex/7bfee6ef69cd0b0f6902be857aba97d2b3140b36.png" alt="$ \mathcal{L} $" /> is the object whose integral we want to minimize. This is called a <i>Lagrangian</i>. Another Lagrangian that comes up a lot is the kinetic energy minus the potential energy of a mechanical system. We minimize <img class="teximage" src="/sites/default/files/tex/5e5629a35a0da8105df0085e04721871fabab195.png" alt="$ S $" /> by picking appropriate functions for <img class="teximage" src="/sites/default/files/tex/06188ab2180e2f0b691df8e896b7ce10d10c8198.png" alt="$ (\theta(t), \phi(t)) $" />. <img class="teximage" src="/sites/default/files/tex/5e5629a35a0da8105df0085e04721871fabab195.png" alt="$ S $" /> is a function that we minimize, not by plugging in a number, but by plugging in another function. Some people call this a <i>functional</i>. For the functional <img class="teximage" src="/sites/default/files/tex/5e5629a35a0da8105df0085e04721871fabab195.png" alt="$ S $" /> to be minimized, a necessary condition is that its <a href="http://en.wikipedia.org/wiki/Functional_derivative">functional derivative</a> be zero. This is equivalent to solving the <i>Euler-Lagrange equations</i>:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/a59f13ccfb8619786c0c09661cd082396a264c6b.png" alt="\begin{align*}<br /> \frac{\partial \mathcal{L}}{\partial \theta} &= \frac{\textup{d}}{\textup{d}t} \frac{\partial \mathcal{L}}{\partial \frac{\textup{d}\theta}{\textup{d}t}} \\<br /> \frac{\partial \mathcal{L}}{\partial \phi} &= \frac{\textup{d}}{\textup{d}t} \frac{\partial \mathcal{L}}{\partial \frac{\textup{d}\phi}{\textup{d}t}}<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>For this particular Lagrangian, another equivalent procedure is solving the <a href="http://en.wikipedia.org/wiki/Geodesic_equation">geodesic equations</a>. The geodesic equations are usually written out in an arbitrary number of dimensions with <img class="teximage" src="/sites/default/files/tex/9837c4f2cd0e02dfdeb22b5ab4f30f62acbf0501.png" alt="$ x^i $" /> denoting the coordinates.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/47c2243005c64b83022b190773a347fc83ab6b07.png" alt="\[<br /> \frac{\textup{d}^2x^i}{\textup{d}t^2} + \sum_{j, k} \Gamma^i_{j k} \frac{\textup{d}x^j}{\textup{d}t} \frac{\textup{d}x^k}{\textup{d}t} = 0<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>The <img class="teximage" src="/sites/default/files/tex/a677c643f61bc1d62106eec8dfff0518cfa40e3f.png" alt="$ \Gamma $" />s are called the <i>Christoffel symbols</i> and they can be tediously computed with the formula:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/66f5021397c1fe5efbc4d230cb181f9aa051b164.png" alt="\[<br /> \Gamma^i_{j k} = \frac{1}{2} \sum_l g^{i l} \left ( \frac{\partial g_{j l}}{\partial x^k} + \frac{\partial g_{k l}}{\partial x^j} - \frac{\partial g_{j k}}{\partial x^l}\right )<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>The <img class="teximage" src="/sites/default/files/tex/fa99f85c022502724540c8b88ce5e5710602db10.png" alt="$ g $" />s with lower indices are the components of <img class="teximage" src="/sites/default/files/tex/5a89a05dc5c76dddab67c6c0758ad36e3e3b8689.png" alt="$ G $" /> while the <img class="teximage" src="/sites/default/files/tex/fa99f85c022502724540c8b88ce5e5710602db10.png" alt="$ g $" />s with upper indices are the components of <img class="teximage" src="/sites/default/files/tex/f1795e8ff1c05109f677020d74f678ba41836f62.png" alt="$ G^{-1} $" />.</p>
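<p>"Tediously" is the right word, so here is a sketch that lets a computer do it. It evaluates the formula above for a two-dimensional metric, estimating the partial derivatives with central differences, and tries it on the unit sphere where the non-zero symbols are known to be −sin θ cos θ and cot θ. The function names and the step size <code>h</code> are illustrative choices, not anything from a library.</p>

```python
import math


def sphere_metric(x, radius=1.0):
    """Components g_{jl} of the sphere's metric at x = (theta, phi)."""
    theta, _phi = x
    return [[radius**2, 0.0], [0.0, (radius * math.sin(theta)) ** 2]]


def christoffel(metric, x, h=1e-5):
    """Return gamma[i][j][k] = Gamma^i_{jk} for a 2x2 metric at the point x."""
    n = 2
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    # Components of the inverse matrix (the g's with upper indices).
    g_inv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

    # dg[k][j][l] approximates the derivative of g_{jl} with respect to x^k.
    dg = []
    for k in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[k] += h
        x_minus[k] -= h
        g_plus, g_minus = metric(x_plus), metric(x_minus)
        dg.append(
            [[(g_plus[j][l] - g_minus[j][l]) / (2 * h) for l in range(n)] for j in range(n)]
        )

    return [
        [
            [
                0.5 * sum(
                    g_inv[i][l] * (dg[k][j][l] + dg[j][k][l] - dg[l][j][k])
                    for l in range(n)
                )
                for k in range(n)
            ]
            for j in range(n)
        ]
        for i in range(n)
    ]


theta = 0.7
gamma = christoffel(sphere_metric, [theta, 0.3])
print("Gamma^theta_{phi phi}:", gamma[0][1][1], "vs", -math.sin(theta) * math.cos(theta))
print("Gamma^phi_{theta phi}:", gamma[1][0][1], "vs", math.cos(theta) / math.sin(theta))
```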
<p>If we do all this for our metric on the box, we get a pair of coupled differential equations that in principle allow us to actually solve for the shortest path between two ants:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/c994eb162cdad3f88bd90c0b954c751495a1d153.png" alt="\begin{align*}<br /> &\frac{\textup{d}^2\theta}{\textup{d}t^2} + \frac{1}{2(g_{\theta \theta} g_{\phi \phi} - g_{\theta \phi}^2)} \left [ \left ( g_{\phi \phi} \frac{\partial g_{\theta \theta}}{\partial \theta} - g_{\theta \phi} \left( 2\frac{\partial g_{\theta \phi}}{\partial \theta} - \frac{\partial g_{\theta \theta}}{\partial \phi} \right ) \right ) \left ( \frac{\textup{d}\theta}{\textup{d}t} \right )^2 \right \none \\<br /> &+ \left \none 2 \left ( g_{\phi \phi} \frac{\partial g_{\theta \theta}}{\partial \phi} - g_{\theta \phi} \frac{\partial g_{\phi \phi}}{\partial \theta} \right ) \frac{\textup{d}\theta}{\textup{d}t} \frac{\textup{d}\phi}{\textup{d}t} + \left ( g_{\phi \phi} \left ( 2\frac{\partial g_{\theta \phi}}{\partial \phi} - \frac{\partial g_{\phi \phi}}{\partial \theta} \right ) - g_{\theta \phi} \frac{\partial g_{\phi \phi}}{\partial \phi} \right ) \left ( \frac{\textup{d}\phi}{\textup{d}t} \right )^2\right ] = 0 \\<br /> &\frac{\textup{d}^2\phi}{\textup{d}t^2} + \frac{1}{2(g_{\theta \theta} g_{\phi \phi} - g_{\theta \phi}^2)} \left [ \left ( g_{\theta \theta} \left ( 2\frac{\partial g_{\theta \phi}}{\partial \theta} - \frac{\partial g_{\theta \theta}}{\partial \phi} \right ) - g_{\theta \phi} \frac{\partial g_{\theta \theta}}{\partial \theta} \right ) \left ( \frac{\textup{d}\theta}{\textup{d}t} \right )^2 \right \none \\<br /> &+ \left \none 2 \left ( g_{\theta \theta} \frac{\partial g_{\phi \phi}}{\partial \theta} - g_{\theta \phi} \frac{\partial g_{\theta \theta}}{\partial \phi} \right ) \frac{\textup{d}\theta}{\textup{d}t} \frac{\textup{d}\phi}{\textup{d}t} + \left ( g_{\theta \theta} \frac{\partial g_{\phi \phi}}{\partial \phi} - g_{\theta \phi} \left ( 2\frac{\partial g_{\theta \phi}}{\partial \phi} - \frac{\partial g_{\phi \phi}}{\partial \theta} \right ) \right ) \left ( \frac{\textup{d}\phi}{\textup{d}t} \right )^2\right ] = 0<br /> 
\end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>This already looks complicated and we will make it even more so when we plug in the <img class="teximage" src="/sites/default/files/tex/fa99f85c022502724540c8b88ce5e5710602db10.png" alt="$ g $" /> values that come from our complicated expression for <img class="teximage" src="/sites/default/files/tex/42675c1df499161a08bca526664b50d7e803dff6.png" alt="$ \textup{d}s^2 $" />. Writing down a geodesic equation for a certain shape is a great way of coming up with a really complicated differential equation. A solution will exist by the <a href="http://en.wikipedia.org/wiki/Hopf%E2%80%93Rinow_theorem">Hopf-Rinow theorem</a> but we don't have a chance in hell of being able to solve this analytically.</p>
<h2>A plot I don't really trust</h2>
<p><img src="/sites/default/files/post_images/2012-08-01_geodesic.png" alt="A supposed geodesic on the box obtained numerically" /><br />
I tried to solve these equations numerically with a poor man's Euler method. In order to do this, we need initial co-ordinates. The male ant's position <a href="http://www.smallperturbation.com/two-ants">eleven twelfths of the way between the floor and the ceiling</a> tells us that <img class="teximage" src="/sites/default/files/tex/4ff73b6076fdd4e95716279866f3f976f8cf71e2.png" alt="$ (\theta(0), \phi(0)) = \left (\frac{\tau}{4} - \arcsin \frac{1}{3}, 0 \right ) $" />. Since we are solving second order equations, we also need an initial velocity. I tried to see if the algorithm would find the length 40.7 path, so I made the initial velocity <img class="teximage" src="/sites/default/files/tex/2050eff702d333a27787d38c789906defa378f78.png" alt="$ \left ( \frac{\textup{d}\theta}{\textup{d}t}(0), \frac{\textup{d}\phi}{\textup{d}t}(0) \right ) = (37\alpha, 17\alpha) $" />. Only the ratio of the two derivatives matters because this defines our initial direction. Geodesics obtained by using different values of <img class="teximage" src="/sites/default/files/tex/52f8e2dbc855bfec2199c16b2d89f8040b6fd46f.png" alt="$ \alpha $" /> will still chart out the same paths for the ants on the box. The ants will just cruise along them at different speeds.</p>
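<p>For readers who want to try the same experiment, here is a minimal sketch of Euler stepping for a geodesic in two co-ordinates. It is not the code that produced the plot above. The function names are made up for illustration, the partial derivatives of the metric are taken by central differences (my assumption), and the equations are used in their standard form, where the mixed term is counted twice. A unit sphere stands in for the box metric as a sanity check, because a geodesic that starts along the equator should stay on it.</p>

```python
import math

def christoffel(metric, theta, phi, h=1e-6):
    """Christoffel symbols of a 2D metric (g_tt, g_tp, g_pp) at a point.

    Partial derivatives are approximated by central differences.
    """
    def d(i, axis):
        if axis == 0:
            return (metric(theta + h, phi)[i] - metric(theta - h, phi)[i]) / (2 * h)
        return (metric(theta, phi + h)[i] - metric(theta, phi - h)[i]) / (2 * h)

    gtt, gtp, gpp = metric(theta, phi)
    c = 1.0 / (2.0 * (gtt * gpp - gtp ** 2))
    gtt_t, gtt_p = d(0, 0), d(0, 1)
    gtp_t, gtp_p = d(1, 0), d(1, 1)
    gpp_t, gpp_p = d(2, 0), d(2, 1)
    gamma_theta = (
        c * (gpp * gtt_t - gtp * (2 * gtp_t - gtt_p)),   # theta, theta
        c * (gpp * gtt_p - gtp * gpp_t),                 # theta, phi
        c * (gpp * (2 * gtp_p - gpp_t) - gtp * gpp_p),   # phi, phi
    )
    gamma_phi = (
        c * (gtt * (2 * gtp_t - gtt_p) - gtp * gtt_t),   # theta, theta
        c * (gtt * gpp_t - gtp * gtt_p),                 # theta, phi
        c * (gtt * gpp_p - gtp * (2 * gtp_p - gpp_t)),   # phi, phi
    )
    return gamma_theta, gamma_phi

def euler_geodesic(metric, pos, vel, dt, steps):
    """Integrate the geodesic equations with the explicit Euler method."""
    theta, phi = pos
    vt, vp = vel
    path = [(theta, phi)]
    for _ in range(steps):
        (a, b, c), (d, e, f) = christoffel(metric, theta, phi)
        # The mixed term appears twice in the sum over both indices.
        acc_t = -(a * vt * vt + 2 * b * vt * vp + c * vp * vp)
        acc_p = -(d * vt * vt + 2 * e * vt * vp + f * vp * vp)
        theta, phi = theta + dt * vt, phi + dt * vp
        vt, vp = vt + dt * acc_t, vp + dt * acc_p
        path.append((theta, phi))
    return path

def sphere_metric(theta, phi):
    """Unit sphere with theta measured from the pole (stand-in for the box)."""
    return (1.0, 0.0, math.sin(theta) ** 2)

equator = euler_geodesic(sphere_metric, (math.pi / 2, 0.0), (0.0, 1.0), 0.001, 1000)
print(equator[-1])
```

<p>Swapping <code>sphere_metric</code> for the box metric is where the trouble described next would show up, since the finite differences straddle the edges.</p>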
<p>Anyway, it doesn't look like this solution heads towards the female ant. I think there is some instability in the algorithm that causes the geodesic to abruptly change course when it gets to one of the edges of the box. Even if I were to get rid of this instability, a numerical simulation still cannot prove that we found the shortest path. This instability might arise because the differential equation is actually not defined on the edges. This is because our function <img class="teximage" src="/sites/default/files/tex/e8f5c92bd1ae357b639e168eb563dcffbe03ad8b.png" alt="$ r $" /> is piecewise. I tried to approximate <img class="teximage" src="/sites/default/files/tex/e8f5c92bd1ae357b639e168eb563dcffbe03ad8b.png" alt="$ r $" /> as a smooth function making the box lack true edges. This did not seem to help much, but it's definitely possible to invest more effort into it than I did.</p>
<p>In order to make <img class="teximage" src="/sites/default/files/tex/2576fd6b92221344e344dc674e1b432faeccaf67.png" alt="$ \frac{1}{\cos\theta} $" /> turn into <img class="teximage" src="/sites/default/files/tex/bb6c7bd7ada06eb6a500b4cc8eb2c8cfc85506a3.png" alt="$ \frac{1}{\sin\theta} $" /> when the angle is right, I needed to approximate a step function. Luckily, <a href="http://fooplot.com/atan(99*x)">arctan</a> does this well. Instead of going between 0 and 1, it goes between <img class="teximage" src="/sites/default/files/tex/1ef4a6903a8122b5970fdfa5e97f42d24340ba7e.png" alt="$ -\frac{\tau}{4} $" /> and <img class="teximage" src="/sites/default/files/tex/9efc6f983ed676f1d2af72307faba43686f654cd.png" alt="$ \frac{\tau}{4} $" /> so we have to perform an affine transformation. <img class="teximage" src="/sites/default/files/tex/d75903bf75bfeaa410679a59db513008ed0a5e8c.png" alt="$ \frac{2}{\tau} \arctan (N x) + \frac{1}{2} $" /> jumps from 0 to 1 at <img class="teximage" src="/sites/default/files/tex/ec8d86c0cad293d59bf5dca52194b05184ab3d53.png" alt="$ x = 0 $" /> if <img class="teximage" src="/sites/default/files/tex/b7846edac15ac3666e167971fcc8342a20414835.png" alt="$ N $" /> is large. Now we just need a function that is periodic and positive one quarter of the time. This was hard for me to find at first because the familiar trigonometric functions are positive half of the time. The answer to this question of mathematical artistry came when I thought about Dirichlet kernels. I realized that I could just sum up cosines. <img class="teximage" src="/sites/default/files/tex/e2f405da13540548607dfea6043a359c3fc14ac0.png" alt="$ \cos\theta + \cos2\theta - \frac{1}{\sqrt{2}} $" /> is positive only in the region <img class="teximage" src="/sites/default/files/tex/ee2813a39d6086c93fc4bbb13342762f1f89f1e7.png" alt="$ -\frac{\tau}{8} \leq \theta < \frac{\tau}{8} $" /> and we can obtain functions that are positive in the other desired regions by shifting the argument. 
We therefore have:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/278320879817045ed9de58708f1797502f4e1b8d.png" alt="\begin{align*}<br /> r(\theta) &= \frac{1}{\cos\theta} \left [ \arctan N \left( \cos\theta + \cos2\theta - \frac{1}{\sqrt{2}} \right ) - \arctan N \left( -\cos\theta + \cos2\theta - \frac{1}{\sqrt{2}} \right ) \right ] \\<br /> &+ \frac{1}{\sin\theta} \left [ \arctan N \left( \sin\theta - \cos2\theta - \frac{1}{\sqrt{2}} \right ) - \arctan N \left( -\sin\theta - \cos2\theta - \frac{1}{\sqrt{2}} \right )\right ]<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>This makes our pair of differential equations <i>even more complicated</i> but I think it's fun to approximate shapes like this.</p>
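<p>The construction can be checked numerically. The sketch below builds the step function and the quarter-period window described above and compares the result with the exact square. The function names are hypothetical, and I have kept the 2/τ factor from the affine transformation in each bracket so that the square has half-width 1. Without that factor the same shape comes out scaled by τ/2. Angles where the sine or cosine vanishes are avoided because the expression divides by them.</p>

```python
import math

def smooth_step(x, n):
    """Roughly 0 for x < 0 and 1 for x > 0 when n is large."""
    return math.atan(n * x) / math.pi + 0.5

def window(theta):
    """Positive only for -tau/8 < theta < tau/8 (mod tau)."""
    return math.cos(theta) + math.cos(2 * theta) - 1 / math.sqrt(2)

def r(theta, n):
    """Smooth approximation to the polar radius of a square of half-width 1."""
    quarter = math.pi / 2
    # +1 on the right side, -1 on the left side, 0 elsewhere.
    horiz = smooth_step(window(theta), n) - smooth_step(window(theta - math.pi), n)
    # +1 on the top side, -1 on the bottom side, 0 elsewhere.
    vert = smooth_step(window(theta - quarter), n) - smooth_step(window(theta + quarter), n)
    return horiz / math.cos(theta) + vert / math.sin(theta)

def exact_square(theta):
    """Polar radius of the true square, for comparison."""
    return 1 / max(abs(math.cos(theta)), abs(math.sin(theta)))

for angle in (0.3, 1.2, 2.5, 5.0):
    print(angle, r(angle, 1e6), exact_square(angle))
```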
<h2>We'll have to land on that small box up ahead...</h2>
<p><img src="/sites/default/files/post_images/2012-08-01_box1.png" alt="Box approximation for small N" /></p>
<h2>That's no box...</h2>
<p><img src="/sites/default/files/post_images/2012-08-01_box2.png" alt="Box approximation for medium N" /></p>
<h2>That's a limit of a sequence of smooth surfaces!</h2>
<p><img src="/sites/default/files/post_images/2012-08-01_box3.png" alt="Box approximation for large N" /></p>
</div></div></div><div class="field field-name-taxonomy-vocabulary-2 field-type-taxonomy-term-reference field-label-above"><div class="field-label">Tags: </div><div class="field-items"><div class="field-item even"><a href="/taxonomy/term/14">too much information</a></div><div class="field-item odd"><a href="/taxonomy/term/15">fun calculations</a></div></div></div>
Thu, 02 Aug 2012 01:29:01 +0000
root
31 at https://smallperturbation.com
Ants On A Box
https://smallperturbation.com/two-ants
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Two and a half years ago, I worked for a teaching program at Queen's that could help first year calculus students get extra marks. It was called <a href="http://www.mast.queensu.ca/~math122/Investigations.shtml">Math Investigations</a> and its purpose was to show interesting problems that students might not see in a regular class. As a third year math student, I could solve most of them in a brief sitting but one problem called "two ants" eluded us and it just happened to be the first problem we presented.<br />
</p><center><br />
<img src="/sites/default/files/post_images/2012-06-01_two_ants.png" alt="The initial setup showing where the ants are." /><br />
</center><br />
My partner for TA'ing that section was <a href="http://cviz.wordpress.com/">James McLean</a> and we were later joined by <a href="http://www.stanford.edu/~robjwang/">Rob Wang</a>. We downloaded <a href="http://www.mast.queensu.ca/~peter/inprocess/contents.htm">problem sets</a> picked out by the head of the program - <a href="http://www.mast.queensu.ca/~peter/index.htm">Peter Taylor</a> - and he also happens to list the ant problem first. The premise is that a 12x12x30 box houses a male ant and a female ant and they are located on the square ends. One is eleven twelfths of the way between the floor and the ceiling, the other is eleven twelfths of the way between the ceiling and the floor. One ant wants to meet the other by crawling on the surface of the box - taking the shortest possible path. If you read <a href="/sites/default/files/post_images/2012-06-01_two_ants.pdf">the solution</a> we were given, you will see that it does <i>not</i> prove which path is the shortest or give a real indication of how one might do so.
<!--break--><p>Finding the shortest distance "as the crow flies" is trivial, but it is harder to find the shortest path among those constrained to the box. The approach favoured by the notes is to unfold the box in a few different ways and compare the lengths of a few different paths you can draw when the box is unfolded. You can get different lengths depending on how you unfold it.<br />
</p><center>
<table>
<tr>
<td>
<img src="/sites/default/files/post_images/2012-06-01_net_example1.png" alt="Path of length root(1658)" width="300" />
</td>
<td>
<img src="/sites/default/files/post_images/2012-06-01_net_example2.png" alt="Path of length 40" width="300" />
</td>
</tr>
</table>
<p></p></center><br />
This is what we were supposed to be teaching - that you should keep trying to improve your answer rather than being satisfied so easily. Two admissible nets for the box are shown above and one path length is <img class="teximage" src="/sites/default/files/tex/3069865b14ebc6dafcdc5017f37d589a11d8e21b.png" alt="$ \sqrt{17^2 + 37^2} \approx 40.7 $" /> while the other is <img class="teximage" src="/sites/default/files/tex/bc4f21b1a56a711ffae23e12e4c8b8ec0b5b03c8.png" alt="$ \sqrt{24^2 + 32^2} = 40 $" />. The notes claim without proof that the path on the right is the shortest path joining the two ants that one can ever find. How many times do we have to unpack the box in order to prove this?
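<p>The two lengths are quick to confirm. This is only an arithmetic check of the numbers quoted above, with labels of my own choosing for the two nets. It says nothing about whether a shorter path exists on some other net.</p>

```python
import math

# Legs of the right triangle drawn on each unfolded net, taken from the
# two square roots quoted above.
nets = {"left net": (17, 37), "right net": (24, 32)}

lengths = {name: math.hypot(a, b) for name, (a, b) in nets.items()}
shortest = min(lengths, key=lengths.get)

for name, length in lengths.items():
    print(f"{name}: {length:.2f}")
print("shorter of the two:", shortest)
```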
<p>What we are doing is mapping the box onto the plane in a way that preserves distance. A function between different surfaces that preserves distance is called an <i>isometry</i>. Since we know that the shortest path between any two points in a plane is a straight line, we can say that the shortest path between the corresponding points on the surface is simply the image of this straight line under the isometry. Of course, the catch is that we are not using an isometry between the plane and the entire box - there is no such thing. It is the <i>restriction</i> of an unpacking scheme to a subset of the box that is an isometry.<br />
</p><center>
<table>
<tr>
<td>
<img src="/sites/default/files/post_images/2012-06-01_shaded_region1.png" alt="Diagram showing points that can be reached from the starting point using a straight line if we unpack the box in the first way." width="300" />
</td>
<td>
<img src="/sites/default/files/post_images/2012-06-01_shaded_region2.png" alt="Diagram showing points that can be reached from the starting point using a straight line if we unpack the box in the second way." width="300" />
</td>
</tr>
</table>
<p></p></center><br />
The same two nets are shown above and this time there is a shaded region. This represents the subset of the box to which we must restrict ourselves if we want distances to still be preserved. A path between the male ant and any green point will have the same length on the folded box as on the unfolded box. The distances found above are different because neither path is contained in both shaded regions. The path of length 40.7 goes outside the right image's "safe zone" and the path of length 40 goes outside the left image's "safe zone". We've drawn two safe zones so far but there are many more.
<p>As a bit of a digression, finding the distinct nets for a polyhedron is not necessarily easy. <a href="http://www.sciencedirect.com/science/article/pii/S0012365X97002252">A 1998 paper</a> counts the number of nets for large polyhedra and polytopes and the procedure is quite involved. There are <a href="http://gwydir.demon.co.uk/jo/solid/cube.htm">11 distinct nets</a> of a cube. We can use this to find out how many distinct nets produce a rectangular prism with a square base. We can start with a cubic net like this one:<br />
</p><center><br />
<img src="/sites/default/files/post_images/2012-06-01_cube_net.png" alt="One net for a cube." width="400" /><br />
</center><br />
We can then choose one face not to elongate. Whether or not another face gets elongated is now completely determined by this so there are initially six ways to turn this into one of the rectangular nets we need. However, only three of them will be distinct because whenever you choose one face not to elongate, you could have just as easily chosen the face across from it and got the same result. This particular cube net produces three rectangular prism nets, but we are not done.<br />
<center>
<table>
<tr>
<td>
<img src="/sites/default/files/post_images/2012-06-01_cube_to_rect1.png" alt="First way to elongate some faces." width="300" />
</td>
<td>
<img src="/sites/default/files/post_images/2012-06-01_cube_to_rect3.png" alt="Third way to elongate some faces." width="300" />
</td>
</tr>
</table>
<p><img src="/sites/default/files/post_images/2012-06-01_cube_to_rect2.png" alt="Second way to elongate some faces." width="400" /><br />
</p></center><br />
When we make a cut through the box in order to unfold it, it matters whether we are cutting below the ant, above it, or on the sides - so for each one of the above diagrams, there are really four possibilities for what its safe zone could be. The four possibilities for the bottom net are shown below.<br />
<center>
<table>
<tr>
<td>
<img src="/sites/default/files/post_images/2012-06-01_sub_net1.png" alt="One of the four sub-nets corresponding to the bottom net." width="300" />
</td>
<td>
<img src="/sites/default/files/post_images/2012-06-01_sub_net2.png" alt="One of the four sub-nets corresponding to the bottom net." width="300" />
</td>
</tr>
<tr>
<td>
<img src="/sites/default/files/post_images/2012-06-01_sub_net3.png" alt="One of the four sub-nets corresponding to the bottom net." width="300" />
</td>
<td>
<img src="/sites/default/files/post_images/2012-06-01_sub_net4.png" alt="One of the four sub-nets corresponding to the bottom net." width="300" />
</td>
</tr>
</table>
<p></p></center><br />
We start with 11 choices, each of these gives rise to 3 further choices and each of these gives rise to 4 further choices. So by this logic, the number of planar ant problems to consider is <img class="teximage" src="/sites/default/files/tex/59be91760baa1e4a6ad688bcf88f58be349987d7.png" alt="$ 11 \cdot 3 \cdot 4 = 132 $" />. I doubt that anyone in the class considered this many cases. Even if they did, it wouldn't matter because there are an infinite number of ways to unpack a shape! If you draw the 132 nets in the procedure that was just described, you will not get this one:<br />
<center><br />
<img src="/sites/default/files/post_images/2012-06-01_esoteric_net.png" alt="A net produced by cutting away from the edges that will give a different path length than any of the 132 canonical ones." width="400" /><br />
</center><br />
If you keep drawing nets like this until you have hundreds more, it will still be easy to think of a net that was missed. Now you might be saying that I cheated. I produced the net above by cutting through one of the faces while the 132 canonical nets only allowed cutting through the edges. But nothing is wrong with this. This is a perfectly valid mapping between the box and the plane and the red path I drew would not be within the safe zones of any of the previous 132, so we need this net. No matter how many nets you draw, you will always be able to find a path that goes outside all of their safe zones requiring you to draw one more. Maybe you're thinking that eventually these nets become so esoteric that they can't possibly yield the shortest path anymore - but that requires proof! Proving that we can get away with the net at the top of this post is the entire problem we are trying to solve!
<p>This is what stumped us during the Math Investigations class. When students warmed up to the idea of unpacking the box in many strange ways, they asked how many times the box would have to be unpacked. We didn't have an answer, and the longer we thought about it, the more we realized that the process would never finish. This is a hard problem so we admitted that there could easily be a path with a length shorter than 40 and we didn't know how to prove it either way. Now that I've come back to the problem, I've thought up a proof that I didn't think of before and it is nice and elementary. Please tell me if you find it as convincing as I do.</p>
<p>Before I get into the proof, I should say that it was inspired by a problem I had to solve in one of my fourth year courses - computational commutative algebra. We were learning the theory of <a href="http://en.wikipedia.org/wiki/Gr%C3%B6bner_basis">Gröbner bases</a> and for an ideal in a polynomial ring, one traditionally computes the Gröbner basis by first choosing a <a href="http://en.wikipedia.org/wiki/Monomial_order">monomial order</a> and then applying <a href="http://en.wikipedia.org/wiki/Buchberger%27s_algorithm">Buchberger's algorithm</a>. I was asked to find all possible Gröbner bases for a particular ideal. This seemed like an impossible task at first since there are an infinite number of monomial orders. However, you don't actually need to know the monomial order in its entirety. You just need to know how it ranks certain monomials and during any step in the algorithm, there are only a finite number of ways in which the monomials can be ranked. So by only choosing as much as I had to, I was able to get the Buchberger algorithm to split into a finite number of child Buchberger algorithms at each step and eventually, all the child processes finished and I got the answer (I was by no means the first person to think of doing this). This divide and conquer technique is amazingly powerful.</p>
<p>You don't have to know what any of this means but the conceptual situation here is similar. Running the algorithm is travelling along a path and keeping track of your distance. The infinitude of cases comes from the fact that there are infinitely many nets. For a given net, the algorithm is guaranteed to finish - we either get to the female ant in record time, or we discover that the path we are on has a length longer than 40 and we quit because we know we cannot be on the shortest path anymore. If we want to find the shortest distance between two points on <i>adjacent</i> faces of the box, this is surely a straight line because a subset of the box containing only two adjacent faces can be completely mapped onto the plane using an isometry. At any given time, the only boundaries we ever cross are between adjacent faces, so instead of having to specify all unfoldings beforehand, at any given time, our unfolding only needs to be specified enough to accommodate the boundaries we have crossed so far. Without further ado, I will show you this whole process pictorially:<br />
</p><center><br />
<img src="/sites/default/files/post_images/2012-06-01_solution1.png" alt="Net for the first angle range." width="400" /><br />
</center><br />
We begin our length 40 line in the horizontal direction at angle <img class="teximage" src="/sites/default/files/tex/d03b39574d7bdfff5df42af88994cf84f468135d.png" alt="$ \theta = 0 $" />. When discussing angles here, we will let <img class="teximage" src="/sites/default/files/tex/bd9c5c9bbb1cf75d0fee80dd7179fd33336a4bff.png" alt="$ \tau = 2\pi $" />. We can rotate the line without changing the boundaries we cross until we get to <img class="teximage" src="/sites/default/files/tex/6e4bdf4746e44ad502b5f9f5044ec65e3855c314.png" alt="$ \theta = \arctan\left( \frac{6}{31} \right) $" />. Notice how we've only specified the important parts of the net here. The next net is valid for <img class="teximage" src="/sites/default/files/tex/9187bf77138cf59e2e40cee690e6ae94fdf7681f.png" alt="$ \arctan\left( \frac{6}{31} \right) \leq \theta \leq \arctan\left( \frac{18}{31} \right) $" />:<br />
<center><br />
<img src="/sites/default/files/post_images/2012-06-01_solution2.png" alt="Net for the second angle range." width="400" /><br />
</center><br />
This is a familiar situation - one in which we come 0.7 units away from hitting the female ant. Close but no cigar. The next net is valid for <img class="teximage" src="/sites/default/files/tex/145f440d2d9015bd39d4dd29e233785c4ab948e5.png" alt="$ \arctan\left( \frac{18}{31} \right) \leq \theta \leq \frac{\tau}{4} - \arctan\left( \frac{1}{6} \right) $" />:<br />
<center><br />
<img src="/sites/default/files/post_images/2012-06-01_solution3.png" alt="Net for the third angle range." width="400" /><br />
</center><br />
This range includes the path that allows us to hit the female ant using 40 length units. If the rest of the angles we sweep out don't allow us to do this, we will have proven that 40 is the shortest distance. The next one works for <img class="teximage" src="/sites/default/files/tex/a989ec6f4b75515da40a2de670ba7e18174c2fad.png" alt="$ \frac{\tau}{4} - \arctan\left( \frac{1}{6} \right) \leq \theta \leq \frac{\tau}{4} - \arctan\left( \frac{1}{36} \right) $" />:<br />
<center><br />
<img src="/sites/default/files/post_images/2012-06-01_solution4.png" alt="Net for the fourth angle range." width="200" /><br />
</center><br />
<img class="teximage" src="/sites/default/files/tex/7311a93c5fd889c3078ecbf22741a321c7eaa563.png" alt="$ \frac{\tau}{4} - \arctan\left( \frac{1}{36} \right) \leq \theta \leq \frac{\tau}{4} + \arctan\left( \frac{11}{36} \right) $" /><br />
<center><br />
<img src="/sites/default/files/post_images/2012-06-01_solution5.png" alt="Net for the fifth angle range." width="550" /><br />
</center><br />
<img class="teximage" src="/sites/default/files/tex/fc44d3b6f8066ce93a5c826d0faa61e023b5fa44.png" alt="$ \frac{\tau}{4} + \arctan\left( \frac{11}{36} \right) \leq \theta \leq \frac{\tau}{2} - \arctan\left( \frac{6}{11} \right) $" /><br />
<center><br />
<img src="/sites/default/files/post_images/2012-06-01_solution6.png" alt="Net for the sixth angle range." width="400" /><br />
</center><br />
<img class="teximage" src="/sites/default/files/tex/506cca09e72ac791152f725631c9196bb65c9741.png" alt="$ \frac{\tau}{2} - \arctan\left( \frac{6}{11} \right) \leq \theta \leq \frac{\tau}{2} $" /><br />
<center><br />
<img src="/sites/default/files/post_images/2012-06-01_solution7.png" alt="Net for the seventh angle range." width="350" /><br />
</center><br />
This brings us back to the axis of symmetry of the problem. We have found that no matter what our initial heading is, we will travel for at least 40 distance units before bridging the two ants. Sweeping out a path of length 40 and adjusting the net used to match the angle... it's so simple! I can't believe I didn't think of it before! Anyway, what we suggested to the first years would've used much higher level math. We talked about how the <a href="http://en.wikipedia.org/wiki/Great_circle">great circle</a> is the shortest path between two points on a sphere as proven by the calculus of variations. We proposed that a similar proof could be done for a box in principle but said that it would be hard. We didn't expect any of them to know the details of how to do that calculation. We tried to give a brief survey of what variational methods are in the remaining minutes, so this is something I might talk about later.
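<p>As a small consistency check on the sweep, the sketch below lists the angle breakpoints quoted above, confirms that they increase from 0 to τ/2, and looks up which net covers a given heading. The two headings tested are my own reading of the earlier diagrams: I assume the legs of length 37 and 32 lie along the θ = 0 direction. The helper name <code>net_index</code> is hypothetical.</p>

```python
import math

TAU = 2 * math.pi

# Breakpoints between the seven nets, copied from the ranges above.
breakpoints = [
    0.0,
    math.atan(6 / 31),
    math.atan(18 / 31),
    TAU / 4 - math.atan(1 / 6),
    TAU / 4 - math.atan(1 / 36),
    TAU / 4 + math.atan(11 / 36),
    TAU / 2 - math.atan(6 / 11),
    TAU / 2,
]

def net_index(theta):
    """Return the 1-based number of the first net whose range contains theta."""
    for i in range(len(breakpoints) - 1):
        if breakpoints[i] <= theta <= breakpoints[i + 1]:
            return i + 1
    raise ValueError("heading outside [0, tau/2]")

is_increasing = all(a < b for a, b in zip(breakpoints, breakpoints[1:]))

# Assumed headings of the two paths discussed earlier.
near_miss = net_index(math.atan2(17, 37))  # the length 40.7 path
hit = net_index(math.atan2(24, 32))        # the length 40 path

print(is_increasing, near_miss, hit)
```

<p>Under that assumption the length 40.7 heading falls in the second range and the length 40 heading in the third, which matches the description of those two nets.</p>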
</div></div></div><div class="field field-name-taxonomy-vocabulary-2 field-type-taxonomy-term-reference field-label-above"><div class="field-label">Tags: </div><div class="field-items"><div class="field-item even"><a href="/taxonomy/term/15">fun calculations</a></div><div class="field-item odd"><a href="/taxonomy/term/16">gratitude</a></div></div></div>
Sat, 02 Jun 2012 01:23:35 +0000
root
29 at https://smallperturbation.com
Messing With Ill-Defined Physics
https://smallperturbation.com/undefined
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>What's a fun thing to do when you learn a less than intuitive concept? Searching the web to find another person's opinion of it! I recently learned about <a href="http://en.wikipedia.org/wiki/Grassman_number">Grassmann numbers</a> and my search turned up <a href="http://motls.blogspot.com/2011/11/celebrating-grassmann-numbers.html">a blog post</a> by a professor named Luboš Motl who makes some pretty debatable claims. After reading the post, I found out that he is actually <a href="http://en.wikipedia.org/wiki/Lubos_Motl">quite famous</a>. So yes, he probably knows much more than me about the subject, but I must still object to how complacent he is about using an object without defining it.</p>
<!--break--><p>If we know that a particle has position <img class="teximage" src="/sites/default/files/tex/19f32e82560b85354ac1af5c40ed7ff967672b26.png" alt="$ x_1 $" /> at time <img class="teximage" src="/sites/default/files/tex/1d5ec095ca049600bf63571cce3122e2f13cd1db.png" alt="$ t_1 $" />... what is the probability that we will see it at position <img class="teximage" src="/sites/default/files/tex/39df588c18808aae4bb3d4ded70b2296fcb81f32.png" alt="$ x_2 $" /> at time <img class="teximage" src="/sites/default/files/tex/0106e808d880e824f6464e7d075d724c4e705358.png" alt="$ t_2 $" />? Well in classical physics, a particle's path is fully determined so if <img class="teximage" src="/sites/default/files/tex/e867d7703d90e0f31f91818ccf411ccb0dfddf29.png" alt="$ (x_2, t_2) $" /> lies on the path through space-time that the particle takes, this probability is 1. Otherwise it is 0. In quantum mechanics, we know this is not the case. The probability is the squared modulus of a complex probability amplitude and there are a few different ways to calculate this amplitude which may be non-trivial. Richard Feynman came up with one way that uses a phase factor:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/d9f50f954c130dc60123a2094917b4555c589eca.png" alt="\[<br /> e^{iS[x(t)] / \hbar}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>His scheme requires one to integrate this phase factor over all (square integrable?) functions <img class="teximage" src="/sites/default/files/tex/a0c4b3980243b5a1ba89b8a40d2d23b759b73c9e.png" alt="$ x(t) $" /> satisfying <img class="teximage" src="/sites/default/files/tex/6aad974ba555d9a1ea5dc7d3ec0b0194dd9abd74.png" alt="$ x(t_1) = x_1 $" /> and <img class="teximage" src="/sites/default/files/tex/99cb5b6bd725addcbebb53e0682da78938b97727.png" alt="$ x(t_2) = x_2 $" />. Functions belong to an infinite-dimensional space so this type of integral called a <i>path integral</i> is a little different from what we're used to. Instead of computing a double integral or a triple integral or something like that, this calculation requires us to integrate over <i>infinitely</i> many parameters.</p>
<p>In quantum field theory, the co-ordinate <img class="teximage" src="/sites/default/files/tex/5608b443fa3a443505c3b765ab16b5f68a5ee1ad.png" alt="$ x $" /> should be replaced by a field <img class="teximage" src="/sites/default/files/tex/48634aba4e06eaa56dd536b8676da9ee8abc5be9.png" alt="$ \phi(x) $" /> that represents arbitrarily many particles. This agrees with Einstein's discovery that you can change how many particles you have in a system by converting some of its mass to energy: <img class="teximage" src="/sites/default/files/tex/3d95b3617d37bf621fbb5cb2c186883bc0450c72.png" alt="$ E = mc^2 $" />. If you start with the assumption that you always have one particle, you are throwing out a lot of potential physics. Now we integrate over all time-dependent fields <img class="teximage" src="/sites/default/files/tex/1d131a52977631eb69c833e95e255d81c1560d94.png" alt="$ \phi(x, t) $" /> such that <img class="teximage" src="/sites/default/files/tex/c8523c01ccf1d877cdbb3daf2e3d827651727e65.png" alt="$ \phi(x, t_1) = \phi_1(x) $" /> and <img class="teximage" src="/sites/default/files/tex/c9465e2409a3bd999321614b50dc1108dfdd7240.png" alt="$ \phi(x, t_2) = \phi_2(x) $" />. If our particle is a boson, this is accomplished by writing <img class="teximage" src="/sites/default/files/tex/1d131a52977631eb69c833e95e255d81c1560d94.png" alt="$ \phi(x, t) $" /> as a linear combination of basis functions where each coefficient is a real number. Then we can in principle integrate all of these real numbers from <img class="teximage" src="/sites/default/files/tex/2c2bec46ca0b3afee729886caeccb39dd714ea00.png" alt="$ -\infty $" /> to <img class="teximage" src="/sites/default/files/tex/cd48d834bcc6434e1265d5636d38a8ca2c9400f8.png" alt="$ \infty $" />.</p>
<p>Where am I going with this? Well if we try to describe fermions, the coefficients cannot be real numbers anymore. When we use real numbers, some of the histories we integrate over have many particles in the same state contradicting the Pauli exclusion principle. Instead, physicists have found that we get the right answer if we set the coefficients to anti-commuting numbers called <i>Grassmann numbers</i>. <a href="http://motls.blogspot.com/2011/11/celebrating-grassmann-numbers.html">At least</a> <a href="http://motls.blogspot.com/2011/11/could-nature-lhc-prefer-n2.html">three</a> <a href="http://physics.stackexchange.com/questions/5005/velvet-way-to-grassmann-numbers">times</a>, Motl has said that these numbers don't belong to <i>any set</i>, something that I find untrue. The two properties below that must be satisfied are anti-commutativity and associativity respectively:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/9d9738c5de70693c78162c6656ed2218e9724a1c.png" alt="\begin{align*}<br /> \theta_i \theta_j &= -\theta_j \theta_i \\<br /> \theta_i (\theta_j \theta_k) &= (\theta_i \theta_j) \theta_k<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>An immature thing I could say is that if you are even using the symbol <img class="teximage" src="/sites/default/files/tex/5cb4a2f745be76a3822c1d7b0431920821a61185.png" alt="$ = $" />, you must be talking about sets - unless you're trying to rewrite all of mathematics. In fact, Grassmann numbers belong to an <i>algebra</i> - a set where elements can be added, multiplied by scalars and multiplied by each other. There is one subtle language convention that has to be made. Do we use the phrase "Grassmann number" to refer to any element of the Grassmann algebra or just the generators? Two generators <img class="teximage" src="/sites/default/files/tex/178b600564be8adbd5980ab30a47c91f688a06e7.png" alt="$ \theta_i $" /> and <img class="teximage" src="/sites/default/files/tex/fa67c2d9c3bd12f387e116c8275756b59abcfc03.png" alt="$ \theta_j $" /> will anti-commute but their product <img class="teximage" src="/sites/default/files/tex/bb4643c6742ff6372aa04c9a105c355e5b76dc80.png" alt="$ \theta_i \theta_j $" /> is necessarily another element of the Grassmann algebra. This element commutes with a third generator <img class="teximage" src="/sites/default/files/tex/d83f4663c6d2f7cf35cc2ce3f9551c7488b97a08.png" alt="$ \theta_k $" /> so it is clear that not all elements of the algebra anti-commute. I will therefore only use the term "Grassmann number" when referring to a generator.</p>
<p>What Luboš Motl meant was that the set containing Grassmann numbers is not as concrete or familiar as sets like <img class="teximage" src="/sites/default/files/tex/48c4ab9a8b600bb90ea72290ac94e2f5f6eaea68.png" alt="$ \mathbb{R} $" /> or <img class="teximage" src="/sites/default/files/tex/a43b6190667802e741f10bc2f6e7c51db4cfd4c5.png" alt="$ \mathbb{C} $" />. Is this true? Even people who stopped taking math after high school might remember that matrices are familiar objects that don't always commute. Matrices can be chosen to anti-commute and indeed if you want to represent a Grassmann algebra having <img class="teximage" src="/sites/default/files/tex/fe695a9389c9b0c07751fac97facfc61a7830b93.png" alt="$ n $" /> generators with matrices, you can do it as long as your matrices are <img class="teximage" src="/sites/default/files/tex/373cf0b311b09461cf38d87383fc741dd3f0e410.png" alt="$ 2^n $" /> by <img class="teximage" src="/sites/default/files/tex/373cf0b311b09461cf38d87383fc741dd3f0e410.png" alt="$ 2^n $" /> or larger. If we have these concrete representations, why don't physicists just use those and spare people the confusion? It's probably because we have to be able to integrate with respect to a Grassmann variable <img class="teximage" src="/sites/default/files/tex/f2e88e678a46477e058a65b999c23eaf8a60074d.png" alt="$ \theta $" />. Since <img class="teximage" src="/sites/default/files/tex/f2e88e678a46477e058a65b999c23eaf8a60074d.png" alt="$ \theta $" /> anti-commutes with itself, <img class="teximage" src="/sites/default/files/tex/740906c2bd82ce36bb9b5a9709f6927f4be419ce.png" alt="$ \theta^2 = 0 $" />. This means that in order to integrate an analytic function <img class="teximage" src="/sites/default/files/tex/04bbc1cbce2e76ea7e96ce6a76907e7ff79932c9.png" alt="$ f(x) = a_0 + a_1 x + a_2 x^2 +... 
$" />, we need to know how to integrate <img class="teximage" src="/sites/default/files/tex/e6252a79d56375ad51d0fd24aad91f4798b00cd5.png" alt="$ 1 $" /> and <img class="teximage" src="/sites/default/files/tex/f2e88e678a46477e058a65b999c23eaf8a60074d.png" alt="$ \theta $" />. Berezin's rules for doing this are:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/adbde941fc12aa7758235063356a0966bea9ecca.png" alt="\begin{align*}<br /> \int_{\Lambda} \textup{d}\theta &= 0 \\<br /> \int_{\Lambda} \theta \textup{d}\theta &= 1<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>This is not what we would get if we tried to naively integrate matrices from <img class="teximage" src="/sites/default/files/tex/2c2bec46ca0b3afee729886caeccb39dd714ea00.png" alt="$ -\infty $" /> to <img class="teximage" src="/sites/default/files/tex/cd48d834bcc6434e1265d5636d38a8ca2c9400f8.png" alt="$ \infty $" />. Now Motl argues that people don't write limits of integration because Grassmann numbers don't belong to any set. I doubt it. When a physicist omits integration limits (from a Berezin integral or a real integral) there is one reason for it: he is lazy! If you check the <a href="http://en.wikipedia.org/wiki/Berezin_integral">Wikipedia article</a> (which I have not edited), you will see that they write an integration domain of <img class="teximage" src="/sites/default/files/tex/ac8def6e29e32ae34076317256f84bca99dc0a23.png" alt="$ \Lambda $" /> just like I did above. This <img class="teximage" src="/sites/default/files/tex/ac8def6e29e32ae34076317256f84bca99dc0a23.png" alt="$ \Lambda $" /> is an exterior algebra which means that the elements inside are equivalence classes of anti-symmetric polynomials. It is up to you whether you find this concrete or not. If this exterior algebra happened to be a differentiable manifold then integration on it would be unambiguously defined for any domain - however I am pretty sure that it's not. In this sense he is right that if someone wanted to integrate <img class="teximage" src="/sites/default/files/tex/f2e88e678a46477e058a65b999c23eaf8a60074d.png" alt="$ \theta $" /> over a proper subset of <img class="teximage" src="/sites/default/files/tex/ac8def6e29e32ae34076317256f84bca99dc0a23.png" alt="$ \Lambda $" /> it would not be clear how to do it.</p>
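<p>The earlier claim that a Grassmann algebra with n generators fits inside matrices of size 2^n by 2^n can be made concrete. The sketch below uses a Jordan-Wigner-style tensor product construction, which is an assumption chosen for illustration (the text only says that such matrices exist), and the helper names are hypothetical.</p>

```python
# Sketch: represent n Grassmann generators by 2^n x 2^n matrices.
# The Jordan-Wigner-style construction is an illustrative assumption,
# not something specified in the post.

def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Ordinary matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    """Entry-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def is_zero(M):
    return all(x == 0 for row in M for x in row)

def grassmann_generators(n):
    """Generator i is Z x ... x Z x a x I x ... x I with i copies of Z in front."""
    a = [[0, 1], [0, 0]]    # nilpotent block: a * a = 0
    Z = [[1, 0], [0, -1]]   # anti-commutes with a
    I = [[1, 0], [0, 1]]
    gens = []
    for i in range(n):
        M = [[1]]
        for factor in [Z] * i + [a] + [I] * (n - i - 1):
            M = kron(M, factor)
        gens.append(M)
    return gens

thetas = grassmann_generators(3)

# Every pair should anti-commute, including a generator with itself (theta^2 = 0).
anticommute = all(
    is_zero(add(matmul(s, t), matmul(t, s))) for s in thetas for t in thetas
)
print(len(thetas[0]), anticommute)
```

<p>The product of all three generators is non-zero in this construction, so the matrices do not collapse the algebra, but none of this says anything about how to integrate over them.</p>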
<p>The operation that sends <img class="teximage" src="/sites/default/files/tex/e6252a79d56375ad51d0fd24aad91f4798b00cd5.png" alt="$ 1 $" /> to <img class="teximage" src="/sites/default/files/tex/605cd3669e47ee5258124d525008357fb243480a.png" alt="$ 0 $" /> and <img class="teximage" src="/sites/default/files/tex/f2e88e678a46477e058a65b999c23eaf8a60074d.png" alt="$ \theta $" /> to <img class="teximage" src="/sites/default/files/tex/e6252a79d56375ad51d0fd24aad91f4798b00cd5.png" alt="$ 1 $" /> isn't really an integral. The path integral formulation for fermions makes it tempting to think of it as something analogous to the genuine integral that people use for bosons. Indeed, the following identity helps to drive that home:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/1a024ca516bd9aba215935c9777ba620447035f6.png" alt="\begin{align*}<br /> \int_{\Lambda^{2n}} \theta_k \theta^{*}_l \exp \left [ -\sum_{i=1}^{n} \sum_{j=1}^{n} \theta^{*}_i A_{i, j} \theta_j \right ] \prod_{p = 1}^{n} \textup{d}\theta^{*}_p \textup{d}\theta_p &= (A^{-1})_{k, l} \int_{\Lambda^{2n}} \exp \left [ -\sum_{i=1}^{n} \sum_{j=1}^{n} \theta^{*}_i A_{i, j} \theta_j \right ] \prod_{p = 1}^{n} \textup{d}\theta^{*}_p \textup{d}\theta_p \\<br /> \int_{\mathbb{R}^{2n}} x_k x^{*}_l \exp \left [ -\sum_{i=1}^{n} \sum_{j=1}^{n} x^{*}_i A_{i, j} x_j \right ] \prod_{p = 1}^{n} \textup{d}x^{*}_p \textup{d}x_p &= (A^{-1})_{k, l} \int_{\mathbb{R}^{2n}} \exp \left [ -\sum_{i=1}^{n} \sum_{j=1}^{n} x^{*}_i A_{i, j} x_j \right ] \prod_{p = 1}^{n} \textup{d}x^{*}_p \textup{d}x_p<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Here is a new definition of the Grassmann numbers that actually allows them to be integrated classically. Consider the following countable set:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/003135632baaa58f3152f16495bb171584f004df.png" alt="\[<br /> A = \left \{ x_p \mapsto \sin(px_p) : p \;\; \mathrm{prime} \right \}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Now Grassmann numbers are functions on the unit circle so our integrals will at most run from <img class="teximage" src="/sites/default/files/tex/605cd3669e47ee5258124d525008357fb243480a.png" alt="$ 0 $" /> to <img class="teximage" src="/sites/default/files/tex/bd9c5c9bbb1cf75d0fee80dd7179fd33336a4bff.png" alt="$ \tau = 2\pi $" />. Next, if we have a function <img class="teximage" src="/sites/default/files/tex/aa743779b9882380eb9785921090d854e6562255.png" alt="$ f $" /> from <img class="teximage" src="/sites/default/files/tex/4a5d655e734d1a84321d3b4494cfd73782db88db.png" alt="$ \mathbb{R}^n $" /> to <img class="teximage" src="/sites/default/files/tex/48c4ab9a8b600bb90ea72290ac94e2f5f6eaea68.png" alt="$ \mathbb{R} $" />, let <img class="teximage" src="/sites/default/files/tex/279ff5b86dd6fcd57dc1f6dfde5057e83a06ebd3.png" alt="$ Z[f] $" /> return one less than the number of <img class="teximage" src="/sites/default/files/tex/8518168a3cf39bdd9f57ae3621fd0d2ad0e80b7a.png" alt="$ n-1 $" /> dimensional subspaces where the function vanishes. Finally, we may use the following Grassmann measure and Grassmann product:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/441efaea013e7663ea479166fc4ad40b614e680d.png" alt="\begin{align*}<br /> \textup{d}\theta_p &\equiv \frac{2}{\tau} \sin(px_p) \textup{d}x_p \\<br /> f \star g &\equiv fg \prod_{p | Z[f]} \prod_{q | Z[g]} \textup{sgn}(p-q)<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Let's see if the desired properties hold.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/cd02a9985bb0caa949b6460c85c239ba1ec450c7.png" alt="\begin{align*}<br /> \theta_p \star \theta_q &= \sin(px_p) \star \sin(qx_q) \\<br /> &= \textup{sgn}(p-q) \sin(px_p) \sin(qx_q) \\<br /> &= -\textup{sgn}(q-p) \sin(qx_q) \sin(px_p) \\<br /> &= -\sin(qx_q) \star \sin(px_p) \\<br /> &= -\theta_q \star \theta_p<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Therefore we have anti-commutativity. Let's see if we have associativity.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/df7277295b169191a5e5031579113d9981c15008.png" alt="\begin{align*}<br /> \theta_p \star (\theta_q \star \theta_r) &= \sin(px_p) \star (\sin(qx_q) \star \sin(rx_r)) \\<br /> &= \sin(px_p) \star \textup{sgn}(q-r) \sin(qx_q) \sin(rx_r) \\<br /> &= \textup{sgn}(p-q) \textup{sgn}(p-r) \textup{sgn}(q-r) \sin(px_p) \sin(qx_q) \sin(rx_r) \\<br /> &= \textup{sgn}(p-q) \sin(px_p) \sin(qx_q) \star \sin(rx_r) \\<br /> &= (\sin(px_p) \star \sin(qx_q)) \star \sin(rx_r) \\<br /> &= (\theta_p \star \theta_q) \star \theta_r<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>We have that too. And of course the integral identities work out.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/8cebf8de37ea2d5bc616331bfd86cca38978e38a.png" alt="\begin{align*}<br /> \int_0^{\tau} \textup{d}\theta_p &= \frac{2}{\tau} \int_0^{\tau} \sin(px_p) \textup{d}x_p = 0 \\<br /> \int_0^{\tau} \theta_p \textup{d}\theta_p &= \frac{2}{\tau} \int_0^{\tau} \sin^2(px_p) \textup{d}x_p = 1<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>This allows us to integrate any analytic function with the understanding that <img class="teximage" src="/sites/default/files/tex/8f4317970a6a4472fd25afcdf822459076628048.png" alt="$ a_0 + a_1 x + a_2 x^2 +... $" /> becomes <img class="teximage" src="/sites/default/files/tex/65a4cec323be4307c63208bf70a0a3b1dafb62dd.png" alt="$ a_0 + a_1 x + a_2 x \star x + ... $" />. Now I can make the previously unknown claim that if I integrate <img class="teximage" src="/sites/default/files/tex/f96a84de598a1e7e262ff2715fd9ea84bf5b7951.png" alt="$ \theta_p $" /> between 0 and 0.587, the number I will get is 0.063. You might say this is all very <i>ad-hoc</i>. I just made up some definition involving "signs and sines" that agrees with the Berezin integral in the limiting cases and is completely unnecessary in all other cases. But the entire premise of using Grassmann numbers in a path integral is also <i>ad-hoc</i>. Physics researchers were rightly upset that path integrals seemed like a nice object for describing bosons but not fermions. They made up definitions that would force the path integrals to work for fermions as well even though there were <a href="http://www.smallperturbation.com/physics-proof">other perfectly good methods</a> for doing the same calculations!</p>
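<p>The two integral identities for the sine-based measure can be checked numerically. This is a minimal sketch using a midpoint rule, with hypothetical function names. The value of a partial integral depends on the prime p, which the text leaves unspecified for the 0 to 0.587 example, so the script prints it for a few primes next to a closed form instead of asserting a single number.</p>

```python
import math

TAU = 2 * math.pi

def theta_measure_integral(integrand, p, upper=TAU, steps=20000):
    """Midpoint-rule estimate of (2/tau) * integral_0^upper integrand(x) sin(p x) dx."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += integrand(x) * math.sin(p * x)
    return 2.0 / TAU * total * h

def partial_closed_form(p, upper):
    """Exact value of (2/tau) * integral_0^upper sin^2(p x) dx."""
    return (upper / 2 - math.sin(2 * p * upper) / (4 * p)) / math.pi

for p in (2, 3, 5):
    # Integral of d(theta_p) over the full circle: should vanish.
    zero = theta_measure_integral(lambda x: 1.0, p)
    # Integral of theta_p d(theta_p) over the full circle: should be one.
    one = theta_measure_integral(lambda x, p=p: math.sin(p * x), p)
    # The same integrand on the shorter interval [0, 0.587].
    part = theta_measure_integral(lambda x, p=p: math.sin(p * x), p, upper=0.587)
    print(p, round(zero, 9), round(one, 9), part, partial_closed_form(p, 0.587))
```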
<p>I'm also not the only person who has tried to redefine the Berezin integral. <a href="http://iopscience.iop.org/0305-4470/25/7/033">Brzezinski and Rembielinski</a> came up with a different definition and they make it seem like my idea to integrate between <img class="teximage" src="/sites/default/files/tex/605cd3669e47ee5258124d525008357fb243480a.png" alt="$ 0 $" /> and <img class="teximage" src="/sites/default/files/tex/296bde1c0683ea548f35b48d0655bb27f1c47c2b.png" alt="$ \tau $" /> was not far off. Their family of integrals labelled by <img class="teximage" src="/sites/default/files/tex/97f62452d5b15c9121e44b2e25bd35f7252d16a7.png" alt="$ p $" /> behave as:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/5ee5dbab956f28f68ba27a0c7885fa2c5ee0b05a.png" alt="\begin{align*}<br /> \int_{\alpha}^{\beta} \zeta^n \textup{d}\zeta &= (\beta^{n+1}-\alpha^{n+1}) \frac{1}{1+ \dots +p^n} \left ( \frac{p+1}{2 \sqrt{|p|}} \right )^{\frac{n+1}{2}} \\<br /> \int_{\alpha}^{\beta} \zeta^{*n} \textup{d}\zeta^{*} &= (\beta^{n+1}-\alpha^{n+1}) \frac{p^n}{1+ \dots +p^n} \left ( \frac{p+1}{2 \sqrt{|p|}} \right )^{\frac{n+1}{2}}<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>They point out that the Riemann integral is the case <img class="teximage" src="/sites/default/files/tex/b7e14b15ad4546a3c25c435ee83b811fbf8bdd28.png" alt="$ p = 1 $" />. If we take the limit as <img class="teximage" src="/sites/default/files/tex/1937fa401e2161a459c5a0f3b5152a08d8eff29a.png" alt="$ p \rightarrow -1 $" />, we get an integral that vanishes whenever <img class="teximage" src="/sites/default/files/tex/75697f3366b953e657fc57362c2cbc51255511ca.png" alt="$ n \neq 1 $" /> just like the Berezin integral. If we want to normalize the one integral that doesn't vanish, we actually need to integrate from <img class="teximage" src="/sites/default/files/tex/605cd3669e47ee5258124d525008357fb243480a.png" alt="$ 0 $" /> to <img class="teximage" src="/sites/default/files/tex/f01ec285b5833ed7c4400135fcda617529b792a3.png" alt="$ \sqrt{2} $" />. <a href="http://www.sciencedirect.com/science/article/pii/0167278985901484">One paper that I couldn't manage to download</a> proposes yet another definition that allows finite limits. I don't know what their contour integral is but I'm sure it's a well defined element of a well defined set!</p>
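<p>The behaviour of this family near p = -1 can be explored numerically. The sketch below simply evaluates the first formula as quoted above (the function name is hypothetical) and approaches p = -1 from above so that the fractional power stays real.</p>

```python
import math

def br_integral(n, alpha, beta, p):
    """Integral of zeta^n from alpha to beta in the family labelled by p,
    evaluated directly from the first formula quoted above."""
    geometric = sum(p ** k for k in range(n + 1))   # 1 + p + ... + p^n
    factor = ((p + 1) / (2 * math.sqrt(abs(p)))) ** ((n + 1) / 2)
    return (beta ** (n + 1) - alpha ** (n + 1)) * factor / geometric

# p = 1 reproduces the Riemann result (beta^(n+1) - alpha^(n+1)) / (n + 1).
print(br_integral(2, 0.0, 1.0, 1.0))

# Close to p = -1, integrating from 0 to sqrt(2): only n = 1 survives,
# and it comes out close to one.
p = -1 + 1e-6
for n in range(4):
    print(n, br_integral(n, 0.0, math.sqrt(2), p))
```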
<p>Having said all this, I agree with the Bohm bashing at the end of the essay.</p>
</div></div></div><div class="field field-name-taxonomy-vocabulary-2 field-type-taxonomy-term-reference field-label-above"><div class="field-label">Tags: </div><div class="field-items"><div class="field-item even"><a href="/taxonomy/term/13">protest</a></div><div class="field-item odd"><a href="/taxonomy/term/15">fun calculations</a></div></div></div>
Sat, 28 Jan 2012 09:05:22 +0000 root 25 at https://smallperturbation.com
Those Physicists And Their "Physics Proofs"
https://smallperturbation.com/physics-proof
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>There are plenty of cases where a proof written down by a physicist is worse than a proof written down by a mathematician, but this is a particularly bad one. In one of my courses, we got to derive the <a href="http://en.wikipedia.org/wiki/Gamma_matrices">Dirac matrices</a>, which are instrumental in describing <a href="http://en.wikipedia.org/wiki/Spin_%28physics%29">spin 1/2</a> particles. These four matrices are written as <img class="teximage" src="/sites/default/files/tex/32290fd58e1597dffa8cec1679a87c9719b54279.png" alt="$ \gamma $" /> with an index. One definition of them says that they should satisfy the anti-commutation relations of the <a href="http://en.wikipedia.org/wiki/Clifford_algebra">Clifford algebra</a>:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/07715676691314b849ab17ed1c665550d3ff6954.png" alt="\[<br /> \left \{ \gamma^{\mu}, \gamma^{\nu} \right \} \equiv \gamma^{\mu}\gamma^{\nu} + \gamma^{\nu}\gamma^{\mu} = 2 \eta^{\mu\nu} I<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>where <img class="teximage" src="/sites/default/files/tex/c03765255a626614dc5b27b793dcdcad288d3ee2.png" alt="$ \eta $" /> is the <a href="http://en.wikipedia.org/wiki/Minkowski_space">Minkowski metric</a> from special relativity.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/cc398a09e9aaf8f99f4b401fcc945cfc09674962.png" alt="\[<br /> \eta = \left [<br /> \begin{tabular}{cccc}<br /> 1 & 0 & 0 & 0 \\<br /> 0 & -1 & 0 & 0 \\<br /> 0 & 0 & -1 & 0 \\<br /> 0 & 0 & 0 & -1<br /> \end{tabular}<br /> \right ]<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>How big do our matrices have to be in order to satisfy this? They obviously cannot be 1x1 matrices because these are just numbers that commute. It turns out that they have to be at least 4x4 but all published sources I have seen fail at explaining why. I will go through the physics proof that is often given and then set the record straight by writing a real proof. If it appears nowhere else, let it appear here!</p>
<!--break--><h2>Incomplete proof</h2>
<p>I will depart from the convention of calling the matrices <img class="teximage" src="/sites/default/files/tex/00a6d4e8dcfe5815b905cb2e6cb0be9379c71fab.png" alt="$ \gamma^0, \gamma^1, \gamma^2 $" /> and <img class="teximage" src="/sites/default/files/tex/c9717fe81be322b5a4e1b91d8d0dbe7a16433ac4.png" alt="$ \gamma^3 $" />. For some reason I like <img class="teximage" src="/sites/default/files/tex/ab17b36f3d7654359b5f2b31be24f38d68b17794.png" alt="$ \gamma^t, \gamma^x, \gamma^y $" /> and <img class="teximage" src="/sites/default/files/tex/c2cbefcc94ee7b8b58f2e767153481ed7d582731.png" alt="$ \gamma^z $" /> better. The relations above basically say that:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/2f84edabdbd4936d35b6647f90d9ea9bfae72ae2.png" alt="\[<br /> \left ( \gamma^t \right )^2 = I, \;\;\; \left ( \gamma^i \right )^2 = -I<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>and distinct Dirac matrices anti-commute. If we look at the equation <img class="teximage" src="/sites/default/files/tex/9bcf332e7ded4087fd1f0b0c245224fbb7f672e9.png" alt="$ \gamma^{\mu}\gamma^{\nu} = - \gamma^{\nu}\gamma^{\mu} $" /> and take the determinant of both sides, we get: <img class="teximage" src="/sites/default/files/tex/015b381929d0c222332181183d9646f85d1df22e.png" alt="$ \left ( \textup{Det}\gamma^{\mu} \right ) \left ( \textup{Det} \gamma^{\nu} \right ) = (-1)^n \left ( \textup{Det}\gamma^{\nu} \right ) \left ( \textup{Det} \gamma^{\mu} \right ) $" />. If something non-zero is equal to <img class="teximage" src="/sites/default/files/tex/8a6caa29cebd8e28fb492a61f2024107a6bd2edc.png" alt="$ (-1)^n $" /> times itself, <img class="teximage" src="/sites/default/files/tex/fe695a9389c9b0c07751fac97facfc61a7830b93.png" alt="$ n $" /> (the size of the matrices) must be even. The determinants here are indeed non-zero because every Dirac matrix squares to plus or minus the identity. This rules out 3x3 Dirac matrices and the question becomes <i>why can't we represent the Clifford algebra with 2x2 matrices?</i> Most physics textbooks seem to be okay with this part of the proof.</p>
<p>Some people say that the largest possible set of anti-commuting 2x2 matrices has only three elements. Is this supposed to be easy to show? Is the maximal anti-commuting set known for matrices of any size? There is <a href="http://jlms.oxfordjournals.org/content/s1-7/1/58.full.pdf">a paper</a> about that from 1932. It is 11 pages and only proves the 4x4 case so I highly doubt it. Anyway, here is how <a href="http://books.google.ca/books?id=LmkuMMvZkdQC&pg=PA13#v=onepage&q&f=false">other</a> <a href="http://books.google.ca/books?id=BvDP6_D20ekC&pg=PA314#v=onepage&q&f=false">sources</a> proceed to "prove" this result:</p>
<p>We know that the three <a href="http://en.wikipedia.org/wiki/Pauli_matrices">Pauli matrices</a> anti-commute so let three of our Dirac matrices be Pauli matrices. Also, if we take the three Pauli matrices and adjoin the identity, we get a basis for the vector space of 2x2 matrices. Therefore our fourth Dirac matrix must be expressed as:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/bdc5a97c21dd21e56d0086e64ad01097279f1a76.png" alt="\[<br /> M = a_t I + a_x \sigma_x + a_y \sigma_y + a_z \sigma_z<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Since the Pauli matrices anti-commute, the product of distinct Pauli matrices will be traceless (as is a single Pauli matrix). The trace of a squared Pauli matrix is 2. Therefore by linearity, <img class="teximage" src="/sites/default/files/tex/842463101ab81fff398cf6b73a6f2caf15617158.png" alt="$ \textup{Tr} (M \sigma_p) = 2a_p $" />. However, <img class="teximage" src="/sites/default/files/tex/d080ccd35389802fb0536fcfb771b93d4da62129.png" alt="$ M $" /> also has to anti-commute with <img class="teximage" src="/sites/default/files/tex/285d38365e92cb99e0727b544ec5dd312844edf1.png" alt="$ \sigma_p $" />. By the cyclic property of the trace, the trace of <img class="teximage" src="/sites/default/files/tex/8bf0ad3dfd3388a7bab2c4d48520b0ce4a73dd29.png" alt="$ M \sigma_p $" /> is then equal to minus itself, so it has to vanish. This forces <img class="teximage" src="/sites/default/files/tex/bf917f04ce49997c044f6e98c4b2647ea811df87.png" alt="$ a_x, a_y $" /> and <img class="teximage" src="/sites/default/files/tex/5d491451564840929d02d0b0bf948f43bb13d249.png" alt="$ a_z $" /> to all be zero meaning <img class="teximage" src="/sites/default/files/tex/d080ccd35389802fb0536fcfb771b93d4da62129.png" alt="$ M $" /> is a multiple of the identity. A multiple of the identity commutes with every matrix so the fourth matrix we set out to find doesn't exist.</p>
<p>This works if you restrict yourself to a ridiculously special case but who ever said that three of the four anti-commuting matrices should be Pauli matrices? Maybe if you start off with a different set of three anti-commuting matrices there suddenly will be room for a fourth. The proof above would only be complete if it cited some theorem that this never happens. Since I am not aware of such a theorem, I will split our search into two cases and show that in each case we can only find three matrices with the desired properties, not four.</p>
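<p>The trace identities used in the Pauli-matrix argument are easy to confirm by machine. This is a small sketch in plain Python with hypothetical helper names and arbitrarily chosen coefficients. It only checks the special case discussed above and says nothing about other sets of anti-commuting matrices.</p>

```python
# Pauli matrices as nested lists of Python numbers.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
PAULI = {"x": SX, "y": SY, "z": SZ}

def matmul(A, B):
    """Ordinary matrix product for lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def combine(a_t, a_x, a_y, a_z):
    """M = a_t I + a_x sigma_x + a_y sigma_y + a_z sigma_z."""
    terms = [(a_t, I2), (a_x, SX), (a_y, SY), (a_z, SZ)]
    return [[sum(c * S[i][j] for c, S in terms) for j in range(2)] for i in range(2)]

# Arbitrary illustrative coefficients.
M = combine(0.5, 2.0, -1.0, 3.0)

# Each trace picks out twice the coefficient of the matching Pauli matrix.
for name, sigma in PAULI.items():
    print(name, trace(matmul(M, sigma)))
```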
<h2>Complete proof</h2>
<p>Notice that the equations defining our Dirac matrices are invariant under similarity transformations. If <img class="teximage" src="/sites/default/files/tex/3aee780581ae1f0ef92ae037074e983a94cad352.png" alt="$ V $" /> is an invertible matrix,</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/bd0f0a9c65240a61e284118a1c3325da6d509614.png" alt="\begin{align*}<br /> \left \{ V \gamma^{\mu} V^{-1}, V \gamma^{\nu} V^{-1} \right \} &= V \gamma^{\mu}\gamma^{\nu} V^{-1} + V \gamma^{\nu}\gamma^{\mu} V^{-1} \\<br /> &= V \left \{ \gamma^{\mu}, \gamma^{\nu} \right \} V^{-1} \\<br /> &= 2 V \eta^{\mu\nu} I V^{-1} \\<br /> &= 2 \eta^{\mu\nu} I<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>so without loss of generality, we can assume that <img class="teximage" src="/sites/default/files/tex/e86af0396a47a814961d9dcd2cf2dd7ddd8285b5.png" alt="$ \gamma^t $" /> is in Jordan canonical form. Case 1: assume that <img class="teximage" src="/sites/default/files/tex/e86af0396a47a814961d9dcd2cf2dd7ddd8285b5.png" alt="$ \gamma^t $" /> is diagonalizable. You get the identity by squaring <img class="teximage" src="/sites/default/files/tex/e86af0396a47a814961d9dcd2cf2dd7ddd8285b5.png" alt="$ \gamma^t $" /> so the diagonal entries in it can only be <img class="teximage" src="/sites/default/files/tex/a946a762ab3ea0ae165780aa47333c6d0ac3030d.png" alt="$ \pm 1 $" />. If both diagonal entries had the same sign, we would be left with a matrix that commutes with everything. Therefore in this case, <img class="teximage" src="/sites/default/files/tex/2e54a6991df7fe1b3c286ac182a951bbb165886c.png" alt="$ \gamma^t = \sigma_z $" />. Denote the components of <img class="teximage" src="/sites/default/files/tex/599e14e8d164fea74f62efe2358057e8177faa52.png" alt="$ \gamma^x $" /> by <img class="teximage" src="/sites/default/files/tex/3aa7d4ac62d8b69cb25438b445b08f66683b5f1e.png" alt="$ a, b, c $" /> and <img class="teximage" src="/sites/default/files/tex/b4eeb9182b7ca2a2357b3176dea86b9c0a94ff3a.png" alt="$ d $" />. The fact that <img class="teximage" src="/sites/default/files/tex/e86af0396a47a814961d9dcd2cf2dd7ddd8285b5.png" alt="$ \gamma^t $" /> anti-commutes with <img class="teximage" src="/sites/default/files/tex/599e14e8d164fea74f62efe2358057e8177faa52.png" alt="$ \gamma^x $" /> says that:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/2de02e02e9c468f879b44680397f0ca7fcf69e33.png" alt="\begin{align*}<br /> \left [<br /> \begin{tabular}{cc}<br /> 1 & 0 \\<br /> 0 & -1<br /> \end{tabular}<br /> \right ] \left [<br /> \begin{tabular}{cc}<br /> a & b \\<br /> c & d<br /> \end{tabular}<br /> \right ] &= - \left [<br /> \begin{tabular}{cc}<br /> a & b \\<br /> c & d<br /> \end{tabular}<br /> \right ] \left [<br /> \begin{tabular}{cc}<br /> 1 & 0 \\<br /> 0 & -1<br /> \end{tabular}<br /> \right ] \\<br /> \left [<br /> \begin{tabular}{cc}<br /> a & b \\<br /> -c & -d<br /> \end{tabular}<br /> \right ] &= \left [<br /> \begin{tabular}{cc}<br /> a & -b \\<br /> c & -d<br /> \end{tabular}<br /> \right ]<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>The diagonal entries must equal their own negatives, so they vanish. In other words, <img class="teximage" src="/sites/default/files/tex/e86af0396a47a814961d9dcd2cf2dd7ddd8285b5.png" alt="$ \gamma^t $" /> being diagonal forces the spatial Dirac matrices to be anti-diagonal. Now we will let <img class="teximage" src="/sites/default/files/tex/7613804b98fc2891a55890584b713572a75211d6.png" alt="$ \gamma^i $" /> have the entry <img class="teximage" src="/sites/default/files/tex/39126b1a236f0ee0a8c09b0b2cd3316efce0e7ff.png" alt="$ b_i $" /> in the upper right and <img class="teximage" src="/sites/default/files/tex/8f5c9ff2c615bb9861f1a5aa10db922e612f780a.png" alt="$ c_i $" /> in the lower left. If these matrices all anti-commute, the equations that this gives us are:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/784e99d0cf16179fbd8f2e22547812103b569199.png" alt="\begin{align*}<br /> b_x c_y = - c_x b_y \\<br /> b_y c_z = - c_y b_z \\<br /> b_x c_z = - c_x b_z \\<br /> b_x c_x = b_y c_y = b_z c_z = -1<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>where the last equation comes from squaring the spatial Dirac matrices. Multiply the first equation by <img class="teximage" src="/sites/default/files/tex/f7ca7a98a0f9be0862568025828b6e23833a7acb.png" alt="$ c_x $" /> and use the last equation to eliminate <img class="teximage" src="/sites/default/files/tex/1c37d0b5b53b96c2b0ec1d201a8d832ba6a4c1e5.png" alt="$ b_x c_x $" />. This gives <img class="teximage" src="/sites/default/files/tex/014f0636a82aab314c4adc4ee376b6dd332bd904.png" alt="$ c_y = c_x^2 b_y $" />. If we substitute this into the second equation we get <img class="teximage" src="/sites/default/files/tex/0b286bda7a6034a855576c9748a4a6f231b3cf5b.png" alt="$ c_z = -c_x^2 b_z $" />. Now substituting this into the third equation, <img class="teximage" src="/sites/default/files/tex/1c37d0b5b53b96c2b0ec1d201a8d832ba6a4c1e5.png" alt="$ b_x c_x $" /> comes out to <img class="teximage" src="/sites/default/files/tex/e6252a79d56375ad51d0fd24aad91f4798b00cd5.png" alt="$ 1 $" />, contradicting the last equation. Therefore starting with a diagonal <img class="teximage" src="/sites/default/files/tex/e86af0396a47a814961d9dcd2cf2dd7ddd8285b5.png" alt="$ \gamma^t $" /> leads to a contradiction.</p>
<p>Case 2: if <img class="teximage" src="/sites/default/files/tex/e86af0396a47a814961d9dcd2cf2dd7ddd8285b5.png" alt="$ \gamma^t $" /> is not diagonalizable, its Jordan form is:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/5d6f2d53f0b85444dba7a2ee209ac557f1585292.png" alt="\[<br /> \left [<br /> \begin{tabular}{cc}<br /> \lambda & 1 \\<br /> 0 & \lambda<br /> \end{tabular}<br /> \right ]<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>However, the square of this matrix has <img class="teximage" src="/sites/default/files/tex/f8f11042a63679d59c7b8f252e393999f18c5aba.png" alt="$ 2 \lambda $" /> in an off-diagonal entry. We know <img class="teximage" src="/sites/default/files/tex/199ab7c82f212e7e7cabb92e1b71929f5c525679.png" alt="$ \left ( \gamma^t \right )^2 $" /> is diagonal so it is necessary to have <img class="teximage" src="/sites/default/files/tex/6e8f676b60d3b11da2759332812eb06af1091d1e.png" alt="$ \lambda = 0 $" /> but this is not sufficient to give us <img class="teximage" src="/sites/default/files/tex/a443426ba07a2dd20c312aa93aeff004cb61daf1.png" alt="$ \left ( \gamma^t \right )^2 = I $" />. The square of the above matrix with <img class="teximage" src="/sites/default/files/tex/6e8f676b60d3b11da2759332812eb06af1091d1e.png" alt="$ \lambda = 0 $" /> will be the zero matrix, not the identity. Therefore this 2x2 Dirac matrix assumption is a contradiction too and we must use matrices that are 4x4 or larger.</p>
<p>This completes the proof without automatically assuming that everything is a Pauli matrix. It is worth noting however that the Dirac matrices can be expressed quite nicely in terms of the Pauli matrices. It is easy to check that the following expressions satisfy the Clifford algebra relations:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/f44c4583e7812af1e06ca3ae5747bfd258caa803.png" alt="\[<br /> \gamma^t = \left [<br /> \begin{tabular}{cc}<br /> 0 & I \\<br /> I & 0<br /> \end{tabular}<br /> \right ], \;\;\; \gamma^i = \left [<br /> \begin{tabular}{cc}<br /> 0 & \sigma_i \\<br /> -\sigma_i & 0<br /> \end{tabular}<br /> \right ]<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>This close resemblance explains why I want to treat the Dirac and Pauli matrices symmetrically. I learned about the Pauli matrices first, and they were subscripted using x, y, z instead of 1, 2, 3. This is why I reject the idea of using numbers instead of letters on the Dirac matrices. I also refuse to call them "gamma matrices" because no one ever used "sigma matrix" to refer to a Pauli matrix. No, I will not dare to compare myself to Dirac even though he used his own "symmetry conventions" to decide on terminology. Oh wait, I just did.</p>
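<p>The Clifford algebra relations for 4x4 Dirac matrices built from Pauli blocks can be verified directly. The sketch below assumes the chiral-basis sign convention in which the lower-left block of each spatial matrix is minus the Pauli matrix. That sign is what makes the spatial matrices square to minus the identity. With the same sign in both off-diagonal blocks they would commute with the time-like matrix instead. All helper names are hypothetical.</p>

```python
I2 = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(A, B):
    """Ordinary matrix product for lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]

# Assumed convention: gamma^t = [[0, I], [I, 0]], gamma^i = [[0, s_i], [-s_i, 0]].
gamma = {"t": block(ZERO, I2, I2, ZERO)}
for name, s in (("x", SX), ("y", SY), ("z", SZ)):
    gamma[name] = block(ZERO, s, scale(-1, s), ZERO)

ETA = {"t": 1, "x": -1, "y": -1, "z": -1}   # diagonal of the Minkowski metric

def clifford_holds(g):
    """Check that {g[m], g[n]} equals 2 * eta[m][n] * I for every pair."""
    for m in g:
        for n in g:
            anti = add(matmul(g[m], g[n]), matmul(g[n], g[m]))
            target = 2 * ETA[m] if m == n else 0
            for i in range(4):
                for j in range(4):
                    if anti[i][j] != (target if i == j else 0):
                        return False
    return True

print(clifford_holds(gamma))
```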
</div></div></div><div class="field field-name-taxonomy-vocabulary-2 field-type-taxonomy-term-reference field-label-above"><div class="field-label">Tags: </div><div class="field-items"><div class="field-item even"><a href="/taxonomy/term/13">protest</a></div><div class="field-item odd"><a href="/taxonomy/term/15">fun calculations</a></div></div></div>
Thu, 15 Dec 2011 09:07:19 +0000 root 23 at https://smallperturbation.com
Making Circuits
https://smallperturbation.com/making-circuits
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>As part of a summer internship, I got to put together several electronic components, and for the first time, use something more permanent than a <a href="http://en.wikipedia.org/wiki/Breadboard">breadboard</a>. I found my first circuit very frustrating because my solder connections kept coming loose and I was told to make it as small as possible. But seriously... am I so used to learning about algebraic varieties and Feynman diagrams that I have become allergic to learning a real transferable skill?</p>
<p>The need for my first circuit arose because the photodiode that we used to measure the power in various lasers had a proportionality constant that was too small. For every Watt of power, the diode was calibrated to put only <img class="teximage" src="/sites/default/files/tex/83c5d16b9a82b0a98343b0168cc983a9ef59be1c.png" alt="$ 0.5 \textup{mV} $" /> across two pins. We wanted to amplify this to a larger value. The component typically used for these applications that you can buy off the shelf is the <a href="http://en.wikipedia.org/wiki/Op-amp">operational amplifier</a>.</p>
<!--break--><p>Even though it seems like there are thousands of parts that show up in electrical circuits, they are really all made of the same four fundamental components: resistors, capacitors, inductors and the recently built <a href="http://en.wikipedia.org/wiki/Memristor">memristors</a>. Operational amplifiers are chips that house a bunch of these smaller building blocks inside. Op-amps, as they are affectionately called, are drawn as triangles with two input pins and one output pin. The input pins are called the <i>inverting terminal</i>, denoted by a <img class="teximage" src="/sites/default/files/tex/211ffd94088454cd467f9a7c463d619c2c7c7568.png" alt="$ - $" /> sign, and the <i>non-inverting terminal</i>, denoted by a <img class="teximage" src="/sites/default/files/tex/984f6061b9e2de3ae7c40c1a67a8618b9f63eb46.png" alt="$ + $" /> sign. By itself, an op-amp produces an output voltage that is many times larger than the voltage difference between the two terminals.</p>
<p><img src="/sites/default/files/post_images/2011-09-11_op_amp.png" /></p>
<p>Knowing this, how is building the circuit that we plan on using with this photodiode difficult at all? Can't we just use an op-amp and call it a day? That would almost certainly fail because the gain of an op-amp used in this configuration (the open-loop gain) can easily exceed one million. <img class="teximage" src="/sites/default/files/tex/83c5d16b9a82b0a98343b0168cc983a9ef59be1c.png" alt="$ 0.5 \textup{mV} $" /> becoming <img class="teximage" src="/sites/default/files/tex/936ddb00c3b244f92e239de7e9c0303c33e51e10.png" alt="$ 500 \textup{V} $" />? That would be very unreasonable. In useful circuits, op-amps are always kept under control using <i>negative feedback</i>.</p>
<p><img src="/sites/default/files/post_images/2011-09-11_buffer.png" /></p>
<p>This is the principle behind the simple buffer shown above. If <img class="teximage" src="/sites/default/files/tex/b79d7e708ee4750b01911ff8787382a1e2d9514d.png" alt="$ V_{\textup{in}} $" /> is initially <img class="teximage" src="/sites/default/files/tex/4ffa2a414638c22d766fb2449fb7e3b2f7e5d9b8.png" alt="$ 5 \textup{V} $" /> and <img class="teximage" src="/sites/default/files/tex/ad7e6f1830ff2f447291599d308c1aabab82f15d.png" alt="$ V_{\textup{out}} $" /> is initially zero, the op-amp will begin ramping up <img class="teximage" src="/sites/default/files/tex/ad7e6f1830ff2f447291599d308c1aabab82f15d.png" alt="$ V_{\textup{out}} $" /> toward its target value of <img class="teximage" src="/sites/default/files/tex/764fd334f012842e6fb3d2f96c4202fb06f10c12.png" alt="$ 5 \textup{MV} $" />. However, it should be obvious that it never gets there. As soon as <img class="teximage" src="/sites/default/files/tex/ad7e6f1830ff2f447291599d308c1aabab82f15d.png" alt="$ V_{\textup{out}} $" /> climbs ever so slightly above zero, the voltage between the two terminals will become smaller and the target for which the op-amp aims will drop to something like <img class="teximage" src="/sites/default/files/tex/c627e893619754bb3afae74da533aa914945ac4c.png" alt="$ 4.9 \textup{MV} $" />. As this process continues, equilibrium will be established when the inverting and non-inverting terminals have the same voltage (well, different by a factor of <img class="teximage" src="/sites/default/files/tex/b6fb90e3320ee7b9a39ce2a9db0913fe51306d20.png" alt="$ 1 + 10^{-6} $" />) - a fact that we will use in just a moment. For the buffer, this means that <img class="teximage" src="/sites/default/files/tex/4a5903eeade15f39c9b4fb29a8bbca2b172d270c.png" alt="$ V_{\textup{in}} = V_{\textup{out}} $" /> in the steady-state. The purpose of a buffer is to read the signal generated by some voltage source and output a signal that has the exact same voltage but a different current. 
It is often the case that a waveform used to automate some equipment lacks the current required to do real work like driving a motor. A buffer based on a high-current op-amp comes in handy by boosting the current.</p>
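<p>The equilibrium described above can be checked with a few lines of Python. This is only a sketch under the idealisation used in the text: the op-amp aims for an output equal to its gain times the voltage between its terminals, with an assumed open-loop gain of one million (the value implied by the 5 MV target). The function name <code>buffer_output</code> is made up for illustration.</p>

```python
def buffer_output(v_in, gain=1e6):
    """Steady-state output of the unity-gain buffer.

    Solves v_out = gain * (v_in - v_out), the fixed point of the
    feedback loop described in the text. `gain` is an assumed
    open-loop gain, not a measured value.
    """
    return v_in * gain / (1.0 + gain)


v_in = 5.0
v_out = buffer_output(v_in)
print(v_out)         # slightly below 5 V
print(v_in / v_out)  # ratio of the two terminal voltages, 1 + 1/gain
```

<p>The ratio printed on the last line is the factor by which the two terminals still differ once the output has settled.</p>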
<p><img src="/sites/default/files/post_images/2011-09-11_diff_amp.png" /></p>
<p>The circuit that I set out to make is the <i>differential amplifier</i> shown above. I found that the required calculations were easy to do if I split the circuit into two parts.</p>
<table>
<tr>
<td>
<img src="/sites/default/files/post_images/2011-09-11_left_side.png" />
</td>
<td>
<img src="/sites/default/files/post_images/2011-09-11_right_side.png" />
</td>
</tr>
</table>
<p>First look at the left side. We know that the voltage drop between the red dot and the ground arrow is <img class="teximage" src="/sites/default/files/tex/0774162772b2b56487825bcebdadcfc6fa92d799.png" alt="$ V_2 $" />. Since the total resistance along this path is <img class="teximage" src="/sites/default/files/tex/870aabef02cfb143e0ef2d0edef10d3b6ba3a1a2.png" alt="$ R_2 + R_{\textup{g}} $" />, a current of <img class="teximage" src="/sites/default/files/tex/b9a819678cb97b1add8cc49d6279a226991e3f65.png" alt="$ \frac{V_2}{R_2 + R_{\textup{g}}} $" /> flows by Ohm's law. The voltage at the terminal (the potential difference between the blue dot and the ground) is found by multiplying this current by <img class="teximage" src="/sites/default/files/tex/e0b40a0bd5ac5fdce30eb25babfb52a853e6864d.png" alt="$ R_{\textup{g}} $" />: <img class="teximage" src="/sites/default/files/tex/8bc77d49595e5d2bfbd5e26d5f31a9fe478c5e53.png" alt="$ \frac{V_2 R_{\textup{g}}}{R_2 + R_{\textup{g}}} $" />. This piece of information carries over to the right side because, in equilibrium, the op-amp holds its two input terminals at the same voltage. On the right side, the voltage between the blue dot and ground is therefore the same: <img class="teximage" src="/sites/default/files/tex/8bc77d49595e5d2bfbd5e26d5f31a9fe478c5e53.png" alt="$ \frac{V_2 R_{\textup{g}}}{R_2 + R_{\textup{g}}} $" />. We also know that the voltage between the red dot and ground is simply <img class="teximage" src="/sites/default/files/tex/4173a8d93ae28447c7508aac077cd2b608a54a6c.png" alt="$ V_1 $" />. Subtracting the two, we see that the voltage drop across <img class="teximage" src="/sites/default/files/tex/303f78d927d4f10c402b5fdc49bbc1cd2403a000.png" alt="$ R_1 $" /> is <img class="teximage" src="/sites/default/files/tex/dc6e8df9957b4afe906d97112cb72955cd25dbf3.png" alt="$ V_1 - \frac{V_2 R_{\textup{g}}}{R_2 + R_{\textup{g}}} $" />, giving us a current of <img class="teximage" src="/sites/default/files/tex/f2d8edaa2d784d4ce65f868bbedd473c084b5e03.png" alt="$ \frac{V_1}{R_1} - \frac{V_2 R_{\textup{g}}}{R_1(R_2 + R_{\textup{g}})} $" />. 
Now <img class="teximage" src="/sites/default/files/tex/ad7e6f1830ff2f447291599d308c1aabab82f15d.png" alt="$ V_{\textup{out}} $" />, the voltage between the green dot and the ground, is the voltage between the blue dot and the ground minus the voltage between the blue and green dots. To get this blue to green voltage, we multiply the current by <img class="teximage" src="/sites/default/files/tex/49109e64b14b9704cfbdd50ad41bcfdd71f2597a.png" alt="$ R_{\textup{f}} $" />. We therefore have:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/3f25029c48985b522c15fae687264ca70edfa017.png" alt="\begin{align*}<br /> V_{\textup{out}} &= \frac{V_2 R_{\textup{g}}}{R_2 + R_{\textup{g}}} - R_{\textup{f}} I \\<br /> &= \frac{V_2 R_{\textup{g}}}{R_2 + R_{\textup{g}}} - R_{\textup{f}} \left [ \frac{V_1}{R_1} - \frac{V_2 R_{\textup{g}}}{R_1(R_2 + R_{\textup{g}})} \right ] \\<br /> &= \frac{V_2 R_{\textup{g}} (R_1 + R_{\textup{f}})}{R_1(R_2 + R_{\textup{g}})} - \frac{V_1 R_{\textup{f}}}{R_1}<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>We have now derived the formula for the output of a differential amplifier. To make the output depend only on the difference <img class="teximage" src="/sites/default/files/tex/e0ef1fcbf8e686bf3e8e564eeb1b2fab9b8fd744.png" alt="$ V_2 - V_1 $" />, we simply let <img class="teximage" src="/sites/default/files/tex/69fc14aeeadd3b641075641ec06994dd1ea15116.png" alt="$ R_2 = R_1 $" /> and <img class="teximage" src="/sites/default/files/tex/c1e95716ac2da21ea8b13bc1f4f34779a7520422.png" alt="$ R_{\textup{g}} = R_{\textup{f}} $" />. Our formula then simplifies to</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/244d1b704a492b0bd13f9be8251a664be26200e4.png" alt="\[<br /> V_{\textup{out}} = \frac{R_{\textup{f}}}{R_1} (V_2 - V_1)<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
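<p>As a sanity check on the algebra, here is a small Python sketch that evaluates the general formula and the simplified one side by side. The function name <code>diff_amp_output</code> and the input voltages are purely illustrative, and the op-amp is assumed to be ideal.</p>

```python
def diff_amp_output(v1, v2, r1, r2, rf, rg):
    """General differential amplifier output (ideal op-amp assumed):

    V_out = V2*Rg*(R1 + Rf) / (R1*(R2 + Rg)) - V1*Rf/R1
    """
    return v2 * rg * (r1 + rf) / (r1 * (r2 + rg)) - v1 * rf / r1


# Illustrative component values in ohms, matched so R2 = R1 and Rg = Rf.
r1 = r2 = 100.0
rf = rg = 20e3

# Illustrative inputs in volts: a 10 mV difference riding on 100 mV.
v1, v2 = 0.100, 0.110

general = diff_amp_output(v1, v2, r1, r2, rf, rg)
simplified = (rf / r1) * (v2 - v1)
print(general, simplified)
```

<p>With matched resistors the two numbers agree, and the common 100 mV on both inputs drops out of the result.</p>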
<p>So I made this type of circuit and set it up to have a gain of 200 by choosing <img class="teximage" src="/sites/default/files/tex/49109e64b14b9704cfbdd50ad41bcfdd71f2597a.png" alt="$ R_{\textup{f}} $" /> to be <img class="teximage" src="/sites/default/files/tex/cf206df8c2256ffeeca0feffb2d8976f4ab404ca.png" alt="$ 20 \textup{k}\Omega $" /> and <img class="teximage" src="/sites/default/files/tex/303f78d927d4f10c402b5fdc49bbc1cd2403a000.png" alt="$ R_1 $" /> to be <img class="teximage" src="/sites/default/files/tex/d84e65a9b2a008811ea7946331b7936d26f7b49a.png" alt="$ 100 \Omega $" />. I also powered the op-amp with a battery by connecting the negative end of the battery to the negative supply pin and the positive end of the battery to the positive supply pin. Unfortunately, I was faced with two problems. The first problem happened when I tested the circuit using a power supply instead of the photodiode. Large signals like <img class="teximage" src="/sites/default/files/tex/fcec7060dcb38343039ed07c9fd93c4c9d6294a5.png" alt="$ V_2 - V_1 = 10 \textup{mV} $" /> were correctly amplified to <img class="teximage" src="/sites/default/files/tex/397950cac8ce9f06a0064d9f4978aa4bac067b7c.png" alt="$ 2 \textup{V} $" /> but if I shorted the pins and made <img class="teximage" src="/sites/default/files/tex/e0ef1fcbf8e686bf3e8e564eeb1b2fab9b8fd744.png" alt="$ V_2 - V_1 $" /> equal to zero Volts, I still saw an output of about <img class="teximage" src="/sites/default/files/tex/faf1fad43aafd48d088f89c9af9383612fec2c99.png" alt="$ 0.5 \textup{V} $" />. A guru reading this can probably already see the rookie mistake that I made but it has to do with one of the most infamous words in electronics: ground!</p>
<p>The supply pins that an op-amp uses (not shown in typical triangle diagrams) accept a positive and a negative voltage. However, the voltage of one pin has no meaning. Electric potential is only defined when you're talking about the difference between two locations in space. So if one is positive and one is negative, that to me sounds like we just need one to be higher than the other. However, there is an implicit ground in the circuit even if we aren't doing anything fancy like connecting it to the Earth. The ground comes from the fact that the output <img class="teximage" src="/sites/default/files/tex/ad7e6f1830ff2f447291599d308c1aabab82f15d.png" alt="$ V_{\textup{out}} $" /> is being measured relative to some other point - which I chose to be the negative terminal of the battery. If I am using the negative terminal of the battery to mean <img class="teximage" src="/sites/default/files/tex/47b69ad94cb8604b7aff11757bd65ce350f8f4bb.png" alt="$ 0 \textup{V} $" />, I am bound to run into trouble since the negative supply pin is connected to this <img class="teximage" src="/sites/default/files/tex/47b69ad94cb8604b7aff11757bd65ce350f8f4bb.png" alt="$ 0 \textup{V} $" /> source as well and not something lower. Op-amps that demand positive and negative voltage in this way are called "split-supply" op-amps. There are "single-supply" op-amps as well, but I wasn't using one. This had the effect of ruining the linearity of the circuit. If one of the "rails" has a voltage of <img class="teximage" src="/sites/default/files/tex/47b69ad94cb8604b7aff11757bd65ce350f8f4bb.png" alt="$ 0 \textup{V} $" /> when the op-amp expects it to be lower, it will fail if one tries to bring the output too close to the rail. I solved this problem by dividing the voltage of my <img class="teximage" src="/sites/default/files/tex/fe5d94bf493564b6c232b90889c23d3898495a69.png" alt="$ 9 \textup{V} $" /> battery in half with two resistors and using the rail in the middle as my ground. 
This way the supply pins had voltages of <img class="teximage" src="/sites/default/files/tex/4ee14fee5fd9bfc47d70df35095b628484282748.png" alt="$ \pm 4.5 \textup{V} $" /> and everything started working. I could have also supplied <img class="teximage" src="/sites/default/files/tex/de0d5e6205cb01e1b0da482fb9adc539cd8626c6.png" alt="$ \pm 9 \textup{V} $" /> if I had used two batteries.</p>
<p>I ran into a second problem when I tried to use this circuit with the photodiode - the photodiode was failing to provide a <img class="teximage" src="/sites/default/files/tex/83c5d16b9a82b0a98343b0168cc983a9ef59be1c.png" alt="$ 0.5 \textup{mV} $" /> signal for a <img class="teximage" src="/sites/default/files/tex/fb77a4dbe6980603c400019a1c6b9a8ef189ff77.png" alt="$ 1 \textup{W} $" /> laser. This happened because my circuit did not have a high enough resistance. The current required to put a known voltage across a resistor is inversely proportional to the resistance, by Ohm's joke of a law. Using small resistors may be too strenuous for the photodiode which cannot supply very much current. To solve this problem, I kept the gain at 200 by increasing all my resistors by the same factor. <img class="teximage" src="/sites/default/files/tex/49109e64b14b9704cfbdd50ad41bcfdd71f2597a.png" alt="$ R_{\textup{f}} $" /> became <img class="teximage" src="/sites/default/files/tex/27f003df3b8513ab046a208cbc266f45e15e5d1c.png" alt="$ 2 \textup{M}\Omega $" /> and <img class="teximage" src="/sites/default/files/tex/303f78d927d4f10c402b5fdc49bbc1cd2403a000.png" alt="$ R_1 $" /> became <img class="teximage" src="/sites/default/files/tex/25f31469e25b12a5f46baa63b8dfdbd3e5b3ac71.png" alt="$ 10 \textup{k}\Omega $" />. Another way to solve it would be to use two more op-amps and buffer each input using the simple buffer shown at the top of this post. This would turn the differential amplifier into what's called an instrumentation amplifier.</p>
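<p>The effect of scaling the resistors can be put into numbers with a rough Python sketch. It makes a crude assumption that is not in the original circuit analysis: the input resistor alone is treated as the load the photodiode has to drive. The helper names are hypothetical.</p>

```python
def gain(rf, r1):
    """Magnitude of the amplifier gain, Rf / R1."""
    return rf / r1


def current_for_voltage(v, r):
    """Ohm's law: current needed to hold voltage v across resistance r."""
    return v / r


v_signal = 0.5e-3  # volts, the signal level quoted in the text

gain_before = gain(20e3, 100.0)
gain_after = gain(2e6, 10e3)

# Crude assumption: the source only has to drive the input resistor R1.
i_before = current_for_voltage(v_signal, 100.0)
i_after = current_for_voltage(v_signal, 10e3)

print(gain_before, gain_after)  # the gain is unchanged
print(i_before, i_after)        # the required current drops a hundredfold
```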
<p>But something else that an electronics guru should see is that I am wasting effort. I could've used a simpler circuit called an inverting amplifier to get the same result. To do this, we simply force <img class="teximage" src="/sites/default/files/tex/0774162772b2b56487825bcebdadcfc6fa92d799.png" alt="$ V_2 $" /> to be zero allowing us to take out the <img class="teximage" src="/sites/default/files/tex/f5cd57e95071bf2e7884454baf6e33f3fea66935.png" alt="$ R_2 $" /> and <img class="teximage" src="/sites/default/files/tex/e0b40a0bd5ac5fdce30eb25babfb52a853e6864d.png" alt="$ R_{\textup{g}} $" /> resistors. This turns our formula into:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/d5472cf9c060a16687006670c2f404f233ab71c5.png" alt="\[<br /> V_{\textup{out}} = -V_1 \frac{R_{\textup{f}}}{R_1}<br /> \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Now, we just have to make sure that <img class="teximage" src="/sites/default/files/tex/4173a8d93ae28447c7508aac077cd2b608a54a6c.png" alt="$ V_1 $" /> relative to ground is equal to the potential difference between the two pins of the photodiode. How do we do this? Just hook up one pin of the photodiode to the inverting terminal through <img class="teximage" src="/sites/default/files/tex/303f78d927d4f10c402b5fdc49bbc1cd2403a000.png" alt="$ R_1 $" /> and ground the other pin. This would have been easy to do... I made my life too complicated!</p>
</div></div></div><div class="field field-name-taxonomy-vocabulary-2 field-type-taxonomy-term-reference field-label-above"><div class="field-label">Tags: </div><div class="field-items"><div class="field-item even"><a href="/taxonomy/term/15">fun calculations</a></div><div class="field-item odd"><a href="/taxonomy/term/20">electronics</a></div></div></div>Sun, 11 Sep 2011 08:11:08 +0000root18 at https://smallperturbation.comIt's Getting Hot In Here
https://smallperturbation.com/temperature-flux
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>In a <a href="http://www.smallperturbation.com/sunrise-equation">recent post</a>, we used trigonometry to derive the length of a day on the Earth as a function of the observer's latitude and the time of year. As promised, I want to continue modelling the Earth's orbit to see what it can tell us about temperature. The simplest explanation for the seasonal variation of temperature comes from the concept of solar flux. To see what this means, think about taking a ray of sunlight shining on the Earth and decomposing it into two components - one parallel to the Earth's surface and one perpendicular.<br />
<img src="/sites/default/files/post_images/2011-08-07_decomposition.png" width="400" alt="Decomposing a ray of sunlight into two components to find the flux." /><br />
Most of the sunlight going <i>into</i> the ground means this location will be hot, whereas most of the sunlight going <i>along</i> the ground means this location will be cold. The temperature due to direct sunlight is therefore proportional to the cosine of the angle between the ray and the outward normal to the Earth. To actually solve for a temperature, we would have to multiply the intensity of the light by <img class="teximage" src="/sites/default/files/tex/fa6bb826fe04a0cbb1754fe23ecb58c81771f171.png" alt="$ \cos \theta $" /> and integrate this over a region of interest. Our main concern will be solving for <img class="teximage" src="/sites/default/files/tex/fa6bb826fe04a0cbb1754fe23ecb58c81771f171.png" alt="$ \cos \theta $" />, allowing us to express the temperature at one time relative to the temperature at another time without knowing the absolute intensity.</p>
<!--break--><p>To do this, we need a method for keeping track of the Sun's position in the sky. We can do this using the familiar angles of latitude and longitude. This standard geographic co-ordinate system projected onto the sky is called the <i>celestial sphere</i>. If the Sun has a latitude of <img class="teximage" src="/sites/default/files/tex/fc62b6fae9aefdfa8749906c11d44d74d8321644.png" alt="$ 12^{\circ} $" /> and a longitude of <img class="teximage" src="/sites/default/files/tex/5d837e57d27238fedf2cd63da8d6253f68014cfe.png" alt="$ 60^{\circ} $" /> on the celestial sphere, this means that the location on Earth having <img class="teximage" src="/sites/default/files/tex/fc62b6fae9aefdfa8749906c11d44d74d8321644.png" alt="$ 12^{\circ} $" /> latitude and <img class="teximage" src="/sites/default/files/tex/5d837e57d27238fedf2cd63da8d6253f68014cfe.png" alt="$ 60^{\circ} $" /> longitude will see the Sun as being directly overhead. Astronomers would instead say that the Sun has a <img class="teximage" src="/sites/default/files/tex/fc62b6fae9aefdfa8749906c11d44d74d8321644.png" alt="$ 12^{\circ} $" /> declination and a <img class="teximage" src="/sites/default/files/tex/1de9500234009c44068395ed2f09f12104db9328.png" alt="$ 4 \; \textup{hour} $" /> right ascension but for all intents and purposes, this means the same thing.</p>
<p>At any point in time, there is a "sunny spot" - a latitude and longitude on the Earth where the Sun is directly overhead. We will denote this point by <img class="teximage" src="/sites/default/files/tex/ed26e2b89d632d6bd2825bd2fd6e8e6b72acaacb.png" alt="$ (\lambda_{\textup{s}}(t), \phi_{\textup{s}}(t)) $" />. In the following diagram, we see two angles, the sunny latitude and the angle between a direct sun ray and the Earth's axis. This was a very important angle in the last post which we called <img class="teximage" src="/sites/default/files/tex/6b88d9276f78ea7cf0387c17048e4a6c4e1397b4.png" alt="$ \varphi $" />. The formula we had before said that <img class="teximage" src="/sites/default/files/tex/1b3e5eca9fc895b32e308d3f13621476a5c5606c.png" alt="$ \cos \varphi = \cos \left ( \frac{\tau t}{T_1} \right ) \sin \delta $" /> where <img class="teximage" src="/sites/default/files/tex/2665f7b7b546093ced24d951d32a1c788e074414.png" alt="$ T_1 $" /> is 365.24 days, <img class="teximage" src="/sites/default/files/tex/75a2f0f2a39f68921923939128e31280855b514e.png" alt="$ \delta $" /> is the inclination of the Earth's axis and <img class="teximage" src="/sites/default/files/tex/bd9c5c9bbb1cf75d0fee80dd7179fd33336a4bff.png" alt="$ \tau = 2\pi $" />. Together, <img class="teximage" src="/sites/default/files/tex/af5555456af43c37f1a0c15341b4804d4a39ff84.png" alt="$ \lambda_{\textup{s}} $" /> and <img class="teximage" src="/sites/default/files/tex/6b88d9276f78ea7cf0387c17048e4a6c4e1397b4.png" alt="$ \varphi $" /> form a right angle, so the formula we are looking for is <img class="teximage" src="/sites/default/files/tex/92cb73d261373d44e9d0bfc011555e6bd50b8f25.png" alt="$ \sin \lambda_{\textup{s}}(t) = \cos \left ( \frac{t \tau}{T_1} \right ) \sin \delta $" />.<br />
<img src="/sites/default/files/post_images/2011-08-07_sunny_latitude.png" alt="Solving for the sunniest latitude." /><br />
Naturally, the sunny latitude must be somewhere between the Tropic of Cancer and the Tropic of Capricorn. The sunny longitude simply goes from <img class="teximage" src="/sites/default/files/tex/605cd3669e47ee5258124d525008357fb243480a.png" alt="$ 0 $" /> to <img class="teximage" src="/sites/default/files/tex/296bde1c0683ea548f35b48d0655bb27f1c47c2b.png" alt="$ \tau $" /> in the course of a day so a possible formula for it would be <img class="teximage" src="/sites/default/files/tex/508e3c238d03cd57ad3c35d357730dcdc8d7ff70.png" alt="$ \frac{\tau t}{T_2} $" /> where <img class="teximage" src="/sites/default/files/tex/c388a84f7ebead8c651711baf02856fbf43f0482.png" alt="$ T_2 $" /> is 24 hours. This formula, however, is naive: after many days, the sunny longitude at the beginning of the day will have been shifted by the Earth's motion along its orbit. The offset this introduces is <img class="teximage" src="/sites/default/files/tex/9ed64ac868dc2885abbb0b2b4be605bd43d5354f.png" alt="$ \frac{\tau t}{T_1} $" />, giving us:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/8600268614d365cc973c54ee62d55bc2846eef28.png" alt="\[ \left ( \lambda_{\textup{s}}(t), \phi_{\textup{s}}(t) \right ) = \left ( \arcsin \left ( \cos \left ( \frac{\tau t}{T_1} \right ) \sin \delta \right ), \tau t \left ( \frac{1}{T_1} + \frac{1}{T_2} \right ) \right ) \]" /></td>
<td class="dspright"></td>
</tr>
</table>
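<p>The formula above translates directly into Python. In this sketch time is measured in hours, angles are in radians, and the axial tilt is taken to be 23.44 degrees, which is an assumed value since the text does not quote one. The name <code>sunny_spot</code> is made up, and the longitude is reduced modulo a full turn for readability.</p>

```python
import math

TAU = 2 * math.pi
T1 = 365.24 * 24.0           # length of a year in hours
T2 = 24.0                    # length of a day in hours
DELTA = math.radians(23.44)  # assumed inclination of the Earth's axis


def sunny_spot(t):
    """Latitude and longitude (radians) where the Sun is overhead at time t (hours)."""
    lat = math.asin(math.cos(TAU * t / T1) * math.sin(DELTA))
    lon = (TAU * t * (1.0 / T1 + 1.0 / T2)) % TAU
    return lat, lon


for t in (0.0, T1 / 4, T1 / 2):
    lat, lon = sunny_spot(t)
    print(round(t, 1), math.degrees(lat), math.degrees(lon))
```

<p>At t = 0 the sunny latitude equals the tilt itself, so the formula starts the clock with the Sun over the Tropic of Cancer.</p>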
<p>When I wrote down this function for tracking the position of the Sun with time, I first asked myself <i>is it injective?</i> It is actually not. In a given year, there are moments in time for which the Sun's position at that time is revisited later in the same year. Let's demand that the latitudes be equal at times <img class="teximage" src="/sites/default/files/tex/299772508a3ebf708d7cd501a262b49bb85d2a2b.png" alt="$ t_a $" /> and <img class="teximage" src="/sites/default/files/tex/9fc5cf529b2cb002e2c01a0aeaf4349bafc6840c.png" alt="$ t_b $" />. This gives us:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/06c16d6b352fd91d2986948f6fa68bf2cd736c5c.png" alt="\begin{align*} \cos \left ( \frac{\tau t_a}{T_1} \right ) &= \cos \left ( \frac{\tau t_b}{T_1} \right ) \\<br /> t_a &= t_b \;\; \mathrm{or} \;\; t_a = T_1 - t_b<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>Demanding the same thing for the longitudes gives us:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/135265adbba915d102430a14c6223d2e18ee6439.png" alt="\begin{align*} \tau t_a \left ( \frac{1}{T_1} + \frac{1}{T_2} \right ) &= \tau t_b \left ( \frac{1}{T_1} + \frac{1}{T_2} \right ) + k \tau \\<br /> t_a &= t_b + k \tilde{T}<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>where <img class="teximage" src="/sites/default/files/tex/15d1095fbff09f0e21c71a2cd7ee89365b8e7bae.png" alt="$ \tilde{T} $" /> is half the harmonic mean of <img class="teximage" src="/sites/default/files/tex/2665f7b7b546093ced24d951d32a1c788e074414.png" alt="$ T_1 $" /> and <img class="teximage" src="/sites/default/files/tex/c388a84f7ebead8c651711baf02856fbf43f0482.png" alt="$ T_2 $" />, that is, the period whose reciprocal is the sum of their reciprocals - approximately 23.9 hours. Putting these together, we see that if the time is a half-integer multiple of <img class="teximage" src="/sites/default/files/tex/15d1095fbff09f0e21c71a2cd7ee89365b8e7bae.png" alt="$ \tilde{T} $" /> <i>before</i> the midpoint of a year, we will get another chance to see the Sun in the same place if we wait for the same half-integer multiple of <img class="teximage" src="/sites/default/files/tex/15d1095fbff09f0e21c71a2cd7ee89365b8e7bae.png" alt="$ \tilde{T} $" /> <i>after</i> the midpoint of that year. One thing I might try to do is use this information to time a visit to the <a href="http://en.wikipedia.org/wiki/Canadian_War_Museum#Memorial_Hall">Canadian War Museum</a> in Ottawa. The window in Memorial Hall is positioned so that it will be illuminated at 11 a.m. on November 11. In order to see it light up at a different time, I could try going at 1 p.m. on February 19. Results will probably vary because how close this hour is to one of the special hours depends heavily on the leap-year cycle.</p>
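<p>The two conditions can be tested numerically. The sketch below is self-contained and uses assumed values: time in hours, an axial tilt of 23.44 degrees and an arbitrary integer k = 40. The names <code>sunny_spot</code> and <code>T_TILDE</code> are illustrative only.</p>

```python
import math

TAU = 2 * math.pi
T1 = 365.24 * 24.0           # year in hours
T2 = 24.0                    # day in hours
DELTA = math.radians(23.44)  # assumed axial tilt

# Period defined by 1/T_TILDE = 1/T1 + 1/T2.
T_TILDE = 1.0 / (1.0 / T1 + 1.0 / T2)


def sunny_spot(t):
    """Latitude and longitude (radians) of the point with the Sun overhead."""
    lat = math.asin(math.cos(TAU * t / T1) * math.sin(DELTA))
    lon = (TAU * t / T_TILDE) % TAU
    return lat, lon


k = 40  # arbitrary integer, chosen only for illustration
t_b = T1 / 2 - k * T_TILDE / 2  # k/2 periods before the midpoint of the year
t_a = T1 / 2 + k * T_TILDE / 2  # k/2 periods after the midpoint of the year

lat_a, lon_a = sunny_spot(t_a)
lat_b, lon_b = sunny_spot(t_b)
print(T_TILDE)
print(math.degrees(lat_a), math.degrees(lat_b))
print(math.degrees(lon_a), math.degrees(lon_b))
```

<p>Both pairs of angles agree up to rounding (the longitudes modulo a full turn), which is the non-injectivity described above.</p>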
<p>Before I got distracted, the original purpose of this was to calculate <img class="teximage" src="/sites/default/files/tex/fa6bb826fe04a0cbb1754fe23ecb58c81771f171.png" alt="$ \cos \theta $" /> but we should not keep saying <img class="teximage" src="/sites/default/files/tex/fa6bb826fe04a0cbb1754fe23ecb58c81771f171.png" alt="$ \cos \theta $" /> because the flux is only proportional to this number when we are talking about the <i>light side</i> of the Earth. When the angle <img class="teximage" src="/sites/default/files/tex/f2e88e678a46477e058a65b999c23eaf8a60074d.png" alt="$ \theta $" /> puts us on the <i>dark side</i> of the Earth, the flux is zero. So the function we are <i>really</i> trying to find is not <img class="teximage" src="/sites/default/files/tex/fa6bb826fe04a0cbb1754fe23ecb58c81771f171.png" alt="$ \cos \theta $" /> but <img class="teximage" src="/sites/default/files/tex/b9581cb694b9900268ce160d48146edb67cf4f65.png" alt="$ \max \{ 0, \cos \theta \} $" />. Note that this does not mean that the <i>temperature</i> is zero. The flux of sunlight is only one of the factors determining temperature. The Earth's atmosphere does a good job of ensuring that places still stay relatively warm during the night. If we were on a planet with no atmosphere, nightfall <a href="http://en.wikipedia.org/wiki/Mercury_%28planet%29#Surface_conditions_and_.22atmosphere.22_.28exosphere.29">really would</a> bring about a huge drop in temperature. The next diagram shows a situation in which the Sun is just barely able to deliver a non-zero flux.<br />
<img src="/sites/default/files/post_images/2011-08-07_field_of_view.png" alt="A point on the Earth where the Sun is just barely visible on the horizon." /><br />
For the location in this picture, the Sun has either just risen or is just about to set. If we assume that the Sun is very far away, the Sun will stay in the field of view until we move to the "very top" of the circle or the "very bottom". We will now use this approximation to solve for the flux at noon for an arbitrary latitude.<br />
<img src="/sites/default/files/post_images/2011-08-07_angle_solution.png" width="400" alt="The diagram relating the flux angle to the observer's latitude and the sunny latitude." /><br />
When we solve for the flux at solar noon, we only have to worry about the latitude of the observer. The longitude will be equal to <img class="teximage" src="/sites/default/files/tex/a09bd7fa7c6e2ba008b323b079d55a04bd93d0dd.png" alt="$ \phi_{\textup{s}} $" />. The diagram above makes it clear that <img class="teximage" src="/sites/default/files/tex/19cc9564d890f48e6f1d316c7307b902184e2c60.png" alt="$ \theta = \lambda - \lambda_{\textup{s}} $" />. This relies on the fact that the orange lines are parallel which is only true if we make the approximation that the Sun is infinitely far away. Making substitutions, we see that our flux is:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/0a7b23133a6a027d265267da4bd288edc2858111.png" alt="\begin{align*} \cos \theta(\lambda, \phi_{\textup{s}}, t) &= \cos \left ( \lambda - \lambda_{\textup{s}}(t) \right ) \\<br /> &= \cos \lambda \cos \lambda_{\textup{s}}(t) + \sin \lambda \sin \lambda_{\textup{s}}(t) \\<br /> &= \cos \lambda \sqrt{1 - \cos^2 \left ( \frac{\tau t}{T_1} \right ) \sin^2 \delta} + \sin \lambda \cos \left ( \frac{\tau t}{T_1} \right ) \sin \delta<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
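<p>For readers who would like to reproduce the numbers, the noon flux can be evaluated with the following Python sketch. It includes the cut-off at zero for the dark side, measures time in hours and latitude in radians, and assumes an axial tilt of 23.44 degrees. The function name <code>noon_flux</code> is hypothetical.</p>

```python
import math

TAU = 2 * math.pi
T1 = 365.24 * 24.0           # year in hours
DELTA = math.radians(23.44)  # assumed axial tilt


def noon_flux(lat, t):
    """max(0, cos(theta)) at solar noon for latitude lat (radians) at time t (hours)."""
    s = math.cos(TAU * t / T1) * math.sin(DELTA)  # sine of the sunny latitude
    cos_theta = math.cos(lat) * math.sqrt(1.0 - s * s) + math.sin(lat) * s
    return max(0.0, cos_theta)


for lat_deg in (0, 23.44, 45, 80):
    lat = math.radians(lat_deg)
    row = [round(noon_flux(lat, f * T1), 3) for f in (0.0, 0.25, 0.5, 0.75)]
    print(lat_deg, row)
```

<p>Each row lists the relative flux at t = 0 and then a quarter, a half and three quarters of a year later.</p>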
<p>The thing we must do now is plot this function. Again, the plot confirms many intuitive things. The equator achieves the maximum temperature of one on each equinox, and every other latitude between the tropics also reaches one on the two days a year when the Sun passes directly overhead. In order to see days in which the warmest temperature might still be zero, we have to go above the Arctic circle or below the Antarctic circle. In other words, this is equivalent to finding locations with 24 hour night.<br />
<img src="/sites/default/files/post_images/2011-08-07_temp_plot.png" alt="A plot showing the noon-time temperature as a function of time for various latitudes." /><br />
We solved for <img class="teximage" src="/sites/default/files/tex/7521578cdab6383eec4432d87533b44f26642030.png" alt="$ \theta(\lambda, \phi_{\textup{s}}, t) $" /> but a more general quantity to calculate is <img class="teximage" src="/sites/default/files/tex/718db9c994a2c4f111d1929c477eef0e31515c8d.png" alt="$ \theta(\lambda, \phi, t) $" />. If you have forced yourself to visualize all of the operations up to this point, it is not too hard to see that we start off on the ray of sunlight, rotate through an angle of <img class="teximage" src="/sites/default/files/tex/7521578cdab6383eec4432d87533b44f26642030.png" alt="$ \theta(\lambda, \phi_{\textup{s}}, t) $" /> in one plane and then rotate in an orthogonal plane until we get to a longitude of <img class="teximage" src="/sites/default/files/tex/9c4c905cdc3f98ee64c2a7affff5f991ac6f18f5.png" alt="$ \phi $" />. Recalling the <a href="http://www.smallperturbation.com/sunrise-equation">sunrise post</a>, we did something very similar. We derived a relationship between the angles <img class="teximage" src="/sites/default/files/tex/6b88d9276f78ea7cf0387c17048e4a6c4e1397b4.png" alt="$ \varphi $" />, <img class="teximage" src="/sites/default/files/tex/9c4c905cdc3f98ee64c2a7affff5f991ac6f18f5.png" alt="$ \phi $" /> and <img class="teximage" src="/sites/default/files/tex/d263714e55bdb7502ebaea98b6c35de9e714dd97.png" alt="$ \frac{\tau}{4} - \delta $" />. The exact same relationship holds between <img class="teximage" src="/sites/default/files/tex/718db9c994a2c4f111d1929c477eef0e31515c8d.png" alt="$ \theta(\lambda, \phi, t) $" />, <img class="teximage" src="/sites/default/files/tex/7521578cdab6383eec4432d87533b44f26642030.png" alt="$ \theta(\lambda, \phi_{\textup{s}}, t) $" /> and <img class="teximage" src="/sites/default/files/tex/8d71f418e05960158dbc349389c98d16e78d8a56.png" alt="$ \phi - \phi_{\textup{s}} $" /> - the cosine of the first is equal to the cosine of the second times the cosine of the third. All in all, we have:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/be6f5464f701f0492c35b08600a5b2abae726fe2.png" alt="\begin{align*} \cos \theta(\lambda, \phi, t) &= \cos \left ( \lambda - \lambda_{\textup{s}}(t) \right ) \cos \left ( \phi - \phi_{\textup{s}} \right ) \\<br /> &= \left [ \cos \lambda \sqrt{1 - \cos^2 \left ( \frac{\tau t}{T_1} \right ) \sin^2 \delta} + \sin \lambda \cos \left ( \frac{\tau t}{T_1} \right ) \sin \delta \right ] \cos \left ( \phi - \tau t \left ( \frac{1}{T_1} + \frac{1}{T_2} \right ) \right )<br /> \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>If you want this presented visually, you'll have to do it because I cannot think of a nice way to plot this beast.</p>
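<p>Short of a plot, the expression can at least be evaluated on a grid. The Python sketch below transcribes the product formula above and, purely for comparison, also evaluates the spherical law of cosines, which is the standard expression for the angle between two directions on a sphere. The two agree at solar noon and can differ at other longitudes. All names, the tilt of 23.44 degrees and the sample time are assumptions made for illustration.</p>

```python
import math

TAU = 2 * math.pi
T1 = 365.24 * 24.0           # year in hours
T2 = 24.0                    # day in hours
DELTA = math.radians(23.44)  # assumed axial tilt


def sunny_spot(t):
    """Latitude and longitude (radians) of the point with the Sun overhead."""
    lat_s = math.asin(math.cos(TAU * t / T1) * math.sin(DELTA))
    lon_s = (TAU * t * (1.0 / T1 + 1.0 / T2)) % TAU
    return lat_s, lon_s


def flux_product(lat, lon, t):
    """Transcription of the product formula in the text, cut off at zero."""
    lat_s, lon_s = sunny_spot(t)
    return max(0.0, math.cos(lat - lat_s) * math.cos(lon - lon_s))


def flux_spherical(lat, lon, t):
    """Spherical law of cosines, shown only for comparison."""
    lat_s, lon_s = sunny_spot(t)
    c = (math.sin(lat) * math.sin(lat_s)
         + math.cos(lat) * math.cos(lat_s) * math.cos(lon - lon_s))
    return max(0.0, c)


t = 1000.0  # hours after t = 0, an arbitrary sample time
lat = math.radians(45)
lat_s, lon_s = sunny_spot(t)
for offset_deg in (0, 30, 60, 90):
    lon = lon_s + math.radians(offset_deg)
    print(offset_deg, flux_product(lat, lon, t), flux_spherical(lat, lon, t))
```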
</div></div></div><div class="field field-name-taxonomy-vocabulary-2 field-type-taxonomy-term-reference field-label-above"><div class="field-label">Tags: </div><div class="field-items"><div class="field-item even"><a href="/taxonomy/term/15">fun calculations</a></div><div class="field-item odd"><a href="/taxonomy/term/19">space</a></div></div></div>Mon, 08 Aug 2011 02:56:20 +0000root16 at https://smallperturbation.comModelling An Epidemic
https://smallperturbation.com/epidemic-modelling
<div class="field field-name-body field-type-text-with-summary field-label-hidden"><div class="field-items"><div class="field-item even"><p>Two and a half years ago, when I read the research interests of my <a href="http://www.mast.queensu.ca/people/profiles/levit.php">statistics prof</a>, I noticed that he had become interested in analyzing epidemiological models. Now, I might finally understand what he was talking about.</p>
<p><img src="/sites/default/files/post_images/2011-07-23_disease_free_equilibrium.png" alt="SIR model reaching the disease free equilibrium." /></p>
<p>If we let S be the population of individuals who are <i>susceptible</i> to the disease, I be the population <i>infected</i> with it and R the population that has <i>recovered</i>, it is not too big a stretch to say that this plot appears to follow the progression of a non-lethal disease. Only a small number of people have the disease at the beginning, but this number grows because the disease is contagious. People who have recovered are immune to further infection meaning that the epidemic eventually dies out.</p>
<!--break--><p>This is called an SIR-model for obvious reasons, and it is one of several <i>compartmental models</i> in mathematical epidemiology, developed by Kermack and McKendrick. The plot was made by numerically solving three coupled differential equations - one for each compartment. These models are of course idealized versions of the truth. Depending on what you read on the subject, you might see papers where the authors just <i>assume</i> that a bunch of differential equations hold. What I will try to do is motivate the assumptions that lead to this model in slightly more elegant terms.</p>
<p>First, we assume that all individuals in a given compartment are identical. This allows our differential equations to be <i>ordinary</i> differential equations. For example, the number of infected individuals only needs to be tracked as a function of time: <img class="teximage" src="/sites/default/files/tex/8af301beb730459aa3ed201a88b1443ec4fb18ab.png" alt="$ I(t) $" />. Some more complicated models would use <img class="teximage" src="/sites/default/files/tex/43a4bf9ca021dd06d234c353585abeddf6270987.png" alt="$ I(t, x, a) $" /> telling us how many infected individuals of age <img class="teximage" src="/sites/default/files/tex/9b12e95d394689dfb14f256ae2705d57ef7600e2.png" alt="$ a $" /> exist at location <img class="teximage" src="/sites/default/files/tex/5608b443fa3a443505c3b765ab16b5f68a5ee1ad.png" alt="$ x $" /> as a function of time.</p>
<p>The second assumption that is usually made is that each compartment has an exponentially distributed holding time. This makes sense because the exponential distribution is <i>memoryless</i>. In this case someone waiting to recover is equally likely to recover at any given time. Things like <a href="http://www.smallperturbation.com/nuclear-energy">radioactive decay</a> have taught me that exponentially distributed holding times naturally lead to rates of change that are proportional to the population size. Nevertheless, I will derive that again here because I will do something more tricky at the end of this post.</p>
<p>The number of individuals that recover in a given time interval is random, so the best we can do is solve for the <i>expected</i> number of recoveries in an interval. Whenever we model a discrete population acting randomly with a differential equation, it makes sense to use the law of large numbers. This gets us as far as:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/5566998458fc5e51ee6265949d0f424329c9ef82.png" alt="\begin{align*} \frac{E[\mathrm{Recovery} \; \mathrm{in} (t, t + \Delta t)]}{\Delta t} &= \frac{I}{\Delta t} P(t \leq \mathrm{Recovery} \; \mathrm{time} \leq t + \Delta t \; | \; \mathrm{Recovery} \; \mathrm{time} \geq t) \\ &= \frac{I}{\Delta t} P(\mathrm{Recovery} \; \mathrm{time} \leq \Delta t) \\ &= \frac{I}{\Delta t} F(\Delta t) \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>where we have used memorylessness to simplify the conditional probability. We now need to know what the cumulative distribution function <img class="teximage" src="/sites/default/files/tex/71cab9cb731ae3ded449a0dc522e70096361af02.png" alt="$ F $" /> actually is. A quick trip to Wikipedia will tell you that:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/ced4bc9ae06c75cb0d726c8be19f20a0c3886d95.png" alt="\[ f(x) = \begin{cases} \gamma e^{-\gamma x} & x \geq 0 \\ 0 & x < 0 \end{cases}, \; \; \; F(x) = \begin{cases} 1 - e^{-\gamma x} & x \geq 0 \\ 0 & x < 0 \end{cases} \]" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>We now just have to substitute this expression and Taylor expand:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/8a282a7449a0f57aecce449e4436e4893625e044.png" alt="\begin{align*} \frac{\textup{d}I}{\textup{d}t} &= -\lim_{\Delta t \rightarrow 0} \frac{E[\mathrm{Recovery} \; \mathrm{in} (t, t + \Delta t)]}{\Delta t} \\ &= -\lim_{\Delta t \rightarrow 0} \frac{I}{\Delta t} \left ( 1 - e^{-\gamma \Delta t} \right ) \\ &= -\lim_{\Delta t \rightarrow 0} \frac{I}{\Delta t} \left ( 1 - \left ( 1 - \gamma \Delta t + O \left( \Delta t^2 \right ) \right ) \right ) \\ &= -\lim_{\Delta t \rightarrow 0} \left ( \gamma I + \gamma O \left ( \Delta t \right ) \right ) \\ &= -\gamma I \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
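<p>As a quick numerical sanity check on this limit, here is a minimal Python sketch. The function names and the value <code>gamma = 0.5</code> are illustrative assumptions, not part of the derivation. It evaluates the exponential cumulative distribution over a shrinking interval and divides by the interval length, which should approach the recovery rate.</p>

```python
import math


def exponential_cdf(x, gamma):
    """CDF of an exponential holding time with rate gamma (zero for negative x)."""
    # expm1 keeps precision when gamma * x is tiny.
    return -math.expm1(-gamma * max(x, 0.0))


def per_capita_recovery_rate(gamma, dt):
    """Expected recoveries per infected individual per unit time over an interval dt."""
    return exponential_cdf(dt, gamma) / dt


if __name__ == "__main__":
    gamma = 0.5  # illustrative value only
    for dt in (1.0, 0.1, 0.01, 0.001):
        # The printed ratio should creep up towards gamma as dt shrinks.
        print(dt, per_capita_recovery_rate(gamma, dt))
```

<p>Multiplying the limiting value by the number of infected individuals gives the decay term derived above.</p>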
<p>We have now expressed the decay of <img class="teximage" src="/sites/default/files/tex/6b6e8835b025980862a0b136fc83effdf22d346a.png" alt="$ I $" /> in terms of a recovery rate <img class="teximage" src="/sites/default/files/tex/32290fd58e1597dffa8cec1679a87c9719b54279.png" alt="$ \gamma $" />. Can we do the same thing to express the decay of <img class="teximage" src="/sites/default/files/tex/5e5629a35a0da8105df0085e04721871fabab195.png" alt="$ S $" /> in terms of an infection rate <img class="teximage" src="/sites/default/files/tex/2653f9dee796eba1b2cd5a34c570677e8c577b81.png" alt="$ \beta $" />? Not quite. We don't want <img class="teximage" src="/sites/default/files/tex/801c78386458c0f71db5a2c95b761c171b9e797b.png" alt="$ \frac{\textup{d}S}{\textup{d}t} $" /> to equal <img class="teximage" src="/sites/default/files/tex/eba4874ccc38b6abadceadb1b4b85299eb01a332.png" alt="$ -\beta S $" /> because this does not account for the spread of infection. Susceptible individuals get the disease from people who already have it, so instead of using <img class="teximage" src="/sites/default/files/tex/2653f9dee796eba1b2cd5a34c570677e8c577b81.png" alt="$ \beta $" /> as our infection rate, we use <img class="teximage" src="/sites/default/files/tex/88060676d08a613fbc74b5f8669f68b63d6e97fe.png" alt="$ \beta I $" /> thus making the equations non-linear:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/f001c7f5c781a29888a5809c7acdc7c89563685e.png" alt="\begin{align*} \frac{\textup{d}S}{\textup{d}t} &= -\beta I S \\ \frac{\textup{d}I}{\textup{d}t} &= \beta I S - \gamma I \\ \frac{\textup{d}R}{\textup{d}t} &= \gamma I \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
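<p>For readers who want to reproduce a plot like the one at the top, the three equations can be integrated with a few lines of Python. This is only a sketch under stated assumptions: S, I and R are treated as fractions of the population, the integrator is a hand-written fourth-order Runge-Kutta step (the original plot may well have been made differently), and the parameter values in the usage example are made up.</p>

```python
def sir_derivatives(state, beta, gamma):
    """Right-hand side of the SIR equations for state = (S, I, R)."""
    s, i, r = state
    new_infections = beta * i * s
    recoveries = gamma * i
    return (-new_infections, new_infections - recoveries, recoveries)


def rk4_step(state, dt, beta, gamma):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(slope, scale):
        return tuple(x + scale * k for x, k in zip(state, slope))

    k1 = sir_derivatives(state, beta, gamma)
    k2 = sir_derivatives(shifted(k1, dt / 2), beta, gamma)
    k3 = sir_derivatives(shifted(k2, dt / 2), beta, gamma)
    k4 = sir_derivatives(shifted(k3, dt), beta, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(s0, i0, r0, beta, gamma, t_end, dt=0.01):
    """Return the list of (S, I, R) states from time 0 to t_end."""
    state = (s0, i0, r0)
    trajectory = [state]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, beta, gamma)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    # Hypothetical parameters: 1% initially infected, beta = 0.5, gamma = 0.1.
    path = simulate_sir(0.99, 0.01, 0.0, beta=0.5, gamma=0.1, t_end=200.0)
    print(path[-1])
```

<p>Because the three right-hand sides add up to zero, the total S + I + R stays constant along the trajectory, which is a handy check on any implementation.</p>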
<p>The non-linearity makes the dynamical system hard to solve but it also allows it to reach an equilibrium point. The plot shows what is known as the <i>disease-free equilibrium</i>. This dynamical system is the classic starting point in formulating a whole bunch of epidemiological models. It can be summarized by a block diagram where each arrow going into a compartment contributes positively to its derivative and each arrow coming out contributes negatively.</p>
<p><img src="/sites/default/files/post_images/2011-07-23_sir_model3.png" alt="Standard SIR block diagram." /></p>
<p>One thing that is evident in this system is that we don't have any births or deaths. In other words, we assumed that the time scale for the disease was short. If we incorporate a death rate <img class="teximage" src="/sites/default/files/tex/35dce7ba646baf8740dbfea7bfaf6b09c590e5a3.png" alt="$ \mu $" /> and let the birth rate be equal to it, we can explore how long-lived diseases affect a population with zero net growth.</p>
<p><img src="/sites/default/files/post_images/2011-07-23_sir_model2.png" alt="Long term SIR block diagram with zero population growth." /></p>
<p>This changes the three compartments in an asymmetric fashion. The death rate applies equally to S, I and R, but the birth rate only replenishes S. This reflects the fact that individuals start off susceptible because they cannot have the disease when they are born. This bias creates the possibility for an additional equilibrium point called the <i>endemic equilibrium</i>. As shown in the next plot, the system can stabilize in a state where a finite fraction of the population permanently carries the disease.</p>
<p><img src="/sites/default/files/post_images/2011-07-23_endemic_equilibrium.png" alt="Long term SIR model reaching the endemic equilibrium." /></p>
<p>The bifurcation from the disease-free steady-state to the permanent epidemic occurs when a number associated with the system, <img class="teximage" src="/sites/default/files/tex/cfd283b78a9de3c9d3452f4bf4ac93812462541c.png" alt="$ R_0 $" />, becomes greater than 1. The number <img class="teximage" src="/sites/default/files/tex/cfd283b78a9de3c9d3452f4bf4ac93812462541c.png" alt="$ R_0 $" />, called the basic reproduction number, has a different expression for every compartmental model but it always represents the number of new infections that a single infection can be expected to cause in a fully susceptible population. In almost every compartmental model, the <img class="teximage" src="/sites/default/files/tex/b74dee773c67b524749e1106865da8ce38aefa2b.png" alt="$ R_0 < 1 $" /> dynamics are very different from the <img class="teximage" src="/sites/default/files/tex/90b2fddea789745d470c39abb9381e7b19533d10.png" alt="$ R_0 > 1 $" /> dynamics.</p>
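<p>To make the threshold concrete, here is a small Python sketch for the model with equal birth and death rates. The equations in the code are my reading of the block diagram, written for population fractions (S + I + R = 1), so treat them as an assumption rather than something spelled out above. For that assumed form, the basic reproduction number is the infection rate divided by the total rate of leaving the infected compartment, and the endemic equilibrium has a closed form.</p>

```python
def vital_sir_derivatives(state, beta, gamma, mu):
    """Assumed SIR model with births and deaths at rate mu, for fractions S + I + R = 1."""
    s, i, r = state
    ds = mu - beta * i * s - mu * s        # births enter S; infection and death leave
    di = beta * i * s - (gamma + mu) * i   # infection enters I; recovery and death leave
    dr = gamma * i - mu * r                # recovery enters R; death leaves
    return (ds, di, dr)


def basic_reproduction_number(beta, gamma, mu):
    """R_0 for the assumed model: infection rate times mean time spent infected."""
    return beta / (gamma + mu)


def stable_equilibrium(beta, gamma, mu):
    """Endemic equilibrium when R_0 exceeds 1, disease-free equilibrium otherwise."""
    r0 = basic_reproduction_number(beta, gamma, mu)
    if r0 > 1.0:
        return (1.0 / r0, mu * (r0 - 1.0) / beta, gamma * (r0 - 1.0) / beta)
    return (1.0, 0.0, 0.0)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    beta, gamma, mu = 0.5, 0.1, 0.02
    point = stable_equilibrium(beta, gamma, mu)
    print(basic_reproduction_number(beta, gamma, mu), point)
    # All three derivatives should vanish at an equilibrium.
    print(vital_sir_derivatives(point, beta, gamma, mu))
```

<p>In this sketch the infected fraction at equilibrium is proportional to the death rate, which is one way to see why the endemic state only appears once births and deaths are included.</p>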
<p>There are lots of ways in which we can make the model more complicated. We can set the birth rate to <img class="teximage" src="/sites/default/files/tex/ad1e26ae5f7439fb6ab21f7120506cf9e1633e44.png" alt="$ \Theta $" />, something different from the death rate. When the total population changes, it is more appropriate to make the infection rate <img class="teximage" src="/sites/default/files/tex/135e8bd540f577f150fb14a398d82f7acacf5c12.png" alt="$ \beta I / N $" /> instead of <img class="teximage" src="/sites/default/files/tex/88060676d08a613fbc74b5f8669f68b63d6e97fe.png" alt="$ \beta I $" /> (this is merely a rescaling of our <img class="teximage" src="/sites/default/files/tex/2653f9dee796eba1b2cd5a34c570677e8c577b81.png" alt="$ \beta $" /> in the previous case). We can also consider a disease whose virulence induces its own death rate <img class="teximage" src="/sites/default/files/tex/4715ba8bc9a6a827253a0b9f3b814acb881d6bfe.png" alt="$ \nu $" />.</p>
<p><img src="/sites/default/files/post_images/2011-07-23_sir_model1.png" alt="More general SIR model." /></p>
<p>Other common epidemiological models are SEIR (after becoming infected, there is a dormant <i>exposed</i> period during which a person is still unable to transmit the disease), SIS (the disease imparts no immunity and recovered individuals go back to being susceptible), SIRS (temporary immunity) and SIRC (instead of recovering, some people may become asymptomatic carriers that still infect others). In the literature, people have investigated the effects of <a href="http://www.ksiam.org/conference/annual072/upfile/Optimal%20SIR.pdf">vaccination</a>, <a href="http://www.ncbi.nlm.nih.gov/pubmed/17924718">quarantining</a> and <a href="http://www.utdallas.edu/~bxt043000/Publications/Conference-Papers/DM/C185_Simulating_bioterrorism_through_epidemiology_approximation.pdf">biological warfare</a>.</p>
<p>The last thing I want to do is choose a different distribution for the holding time. As we've described it so far, an individual moves between compartments by waiting for an exponentially distributed event to occur. I will keep it fairly simple here and have them wait for <i>two</i> exponentially distributed events. This means their waiting times have a gamma distribution with a shape parameter of 2 whose probability density and cumulative distribution are:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/4c518a465303d5a3bbf516d02d886c30db75d58b.png" alt="\[ g(x) = \begin{cases} x \gamma^2 e^{-\gamma x} & x \geq 0 \\ 0 & x < 0 \end{cases}, \; \; \; G(x) = \begin{cases} 1 - e^{-\gamma x} (\gamma x + 1) & x \geq 0 \\ 0 & x < 0 \end{cases} \]" /></td>
<td class="dspright"></td>
</tr>
</table>
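<p>The claim that waiting for two exponential events gives this cumulative distribution is easy to test by simulation. The following Python sketch is illustrative only: the function names, sample size and seed are arbitrary choices. It adds two independent exponential waits with the same rate and compares the fraction finished by a given time against the closed form.</p>

```python
import math
import random


def gamma2_cdf(x, gamma):
    """G(x) for a gamma distribution with shape 2 and rate gamma."""
    x = max(x, 0.0)
    return 1.0 - math.exp(-gamma * x) * (gamma * x + 1.0)


def empirical_cdf_of_two_waits(x, gamma, samples=100_000, seed=0):
    """Fraction of simulated two-stage waits that have finished by time x."""
    rng = random.Random(seed)
    finished = sum(
        1
        for _ in range(samples)
        if x >= rng.expovariate(gamma) + rng.expovariate(gamma)
    )
    return finished / samples


if __name__ == "__main__":
    gamma = 1.0  # illustrative rate
    for x in (0.5, 2.0, 5.0):
        print(x, empirical_cdf_of_two_waits(x, gamma), gamma2_cdf(x, gamma))
```

<p>The two columns should agree to within sampling noise, which shrinks as the number of samples grows.</p>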
<p>Since this distribution is not memoryless, we must evaluate the following conditional probability:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/5c38fdf8d9ac102df94171f79ac3af004b286892.png" alt="\begin{align*} P(t \leq \mathrm{Recovery} \; \mathrm{time} \leq t + \Delta t \; | \; \mathrm{Recovery} \; \mathrm{time} \geq t) &= \frac{P(t \leq \mathrm{Recovery} \; \mathrm{time} \leq t + \Delta t, \; \mathrm{Recovery} \; \mathrm{time} \geq t)}{P(\mathrm{Recovery} \; \mathrm{time} \geq t)} \\ &= \frac{P(t \leq \mathrm{Recovery} \; \mathrm{time} \leq t + \Delta t)}{1 - P(\mathrm{Recovery} \; \mathrm{time} \leq t)} \\ &= \frac{G(t + \Delta t) - G(t)}{1 - G(t)} \\ &= \frac{e^{-\gamma t}(\gamma t + 1) - e^{-\gamma t - \gamma \Delta t}(\gamma (t + \Delta t) + 1)}{e^{-\gamma t}(\gamma t + 1)} \\ &= \frac{(\gamma t + 1) - e^{-\gamma \Delta t}(\gamma (t + \Delta t) + 1)}{\gamma t + 1} \\ &= 1 - e^{-\gamma \Delta t} \left ( \frac{\gamma t + \gamma \Delta t + 1}{\gamma t + 1} \right ) \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>With this out of the way, we are ready to find the differential equation for this process using the same method as before.</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/6897418890f220b6639c700bfdaa8a8e8029ded1.png" alt="\begin{align*} \frac{\textup{d}I}{\textup{d}t} &= -\lim_{\Delta t \rightarrow 0} \frac{E[\mathrm{Recovery} \; \mathrm{in} (t, t + \Delta t)]}{\Delta t} \\ &= -\lim_{\Delta t \rightarrow 0} \frac{I}{\Delta t} P(t \leq \mathrm{Recovery} \; \mathrm{time} \leq t + \Delta t \; | \; \mathrm{Recovery} \; \mathrm{time} \geq t) \\ &= -\lim_{\Delta t \rightarrow 0} \frac{I}{\Delta t} \left ( 1 - e^{-\gamma \Delta t} \left ( \frac{\gamma t + \gamma \Delta t + 1}{\gamma t + 1} \right ) \right ) \\ &= -\lim_{\Delta t \rightarrow 0} \frac{I}{\Delta t} \left ( 1 - \left ( 1 - \gamma \Delta t + O(\Delta t^2) \right ) \left ( \frac{\gamma t + \gamma \Delta t + 1}{\gamma t + 1} \right ) \right ) \\ &= -\lim_{\Delta t \rightarrow 0} \frac{I}{\Delta t} \left ( 1 - \frac{\gamma t + \gamma \Delta t + 1}{\gamma t + 1} + \gamma \Delta t \frac{\gamma t + \gamma \Delta t + 1}{\gamma t + 1} + O(\Delta t^2) \right ) \\ &= -\lim_{\Delta t \rightarrow 0} \frac{I}{\Delta t} \left ( \frac{\gamma^2 \Delta t (t + \Delta t)}{\gamma t + 1} \right ) \\ &= -I \frac{\gamma^2 t}{\gamma t + 1} \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
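<p>The limit above can be checked numerically without any Taylor expansion. The Python sketch below uses hypothetical function names and an arbitrary rate. It computes the conditional recovery probability directly from the gamma survival function, divides by a small interval, and compares the result with the closed-form rate.</p>

```python
import math


def gamma2_survival(t, gamma):
    """1 - G(t): probability that a two-stage wait is still running at time t."""
    return math.exp(-gamma * t) * (gamma * t + 1.0)


def conditional_recovery_probability(t, dt, gamma):
    """Probability of recovering within dt after t, given no recovery before t."""
    return 1.0 - gamma2_survival(t + dt, gamma) / gamma2_survival(t, gamma)


def closed_form_rate(t, gamma):
    """The per-capita recovery rate gamma^2 t / (gamma t + 1) derived above."""
    return gamma ** 2 * t / (gamma * t + 1.0)


if __name__ == "__main__":
    gamma, dt = 1.0, 1e-6  # illustrative values
    for t in (0.0, 0.5, 3.0, 50.0):
        numeric = conditional_recovery_probability(t, dt, gamma) / dt
        print(t, numeric, closed_form_rate(t, gamma))
```

<p>Two features are worth noticing in the output: the rate is zero at t = 0, since nobody can finish two events instantly, and it climbs towards the exponential rate for large t. Here t is the time since the individual entered the compartment.</p>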
<p>It now seems clear to me how the differential equations must change if we wish to accommodate this new distribution:</p>
<table class="displaymath">
<tr>
<td class="dspleft"><img class="teximage" src="/sites/default/files/tex/4b45844c582b2edf089b8c03cd4d0f1c4f06173b.png" alt="\begin{align*} \frac{\textup{d}S}{\textup{d}t} &= -\frac{\beta^2 t}{\beta t + 1} \frac{S I}{N} \\ \frac{\textup{d}I}{\textup{d}t} &= \frac{\beta^2 t}{\beta t + 1} \frac{S I}{N} - \frac{\gamma^2 t}{\gamma t + 1} I \\ \frac{\textup{d}R}{\textup{d}t} &= \frac{\gamma^2 t}{\gamma t + 1} I \end{align*}" /></td>
<td class="dspright"></td>
</tr>
</table>
<p>I have not looked hard enough to actually find these equations in print, but it would be interesting to use them and see how things change.</p>
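<p>As a starting point for that experiment, here is a minimal Python sketch that integrates the modified equations exactly as written. Several things are assumptions: the Runge-Kutta integrator, the function names and the parameter values are all made up for illustration. More importantly, the code treats t as a single global clock starting at the beginning of the outbreak, whereas the derivation measured t from the moment an individual entered a compartment, so the result is only faithful to the gamma holding time for a cohort that all entered at once.</p>

```python
def two_stage_rate(rate, t):
    """Time-dependent rate rate^2 t / (rate t + 1) from the gamma holding time."""
    return rate ** 2 * t / (rate * t + 1.0)


def modified_derivatives(t, state, beta, gamma, n):
    """Right-hand side of the modified equations, taken at face value."""
    s, i, r = state
    new_infections = two_stage_rate(beta, t) * s * i / n
    recoveries = two_stage_rate(gamma, t) * i
    return (-new_infections, new_infections - recoveries, recoveries)


def rk4_step(t, state, dt, beta, gamma, n):
    """One classical Runge-Kutta step for the time-dependent system."""
    def shifted(slope, scale):
        return tuple(x + scale * k for x, k in zip(state, slope))

    k1 = modified_derivatives(t, state, beta, gamma, n)
    k2 = modified_derivatives(t + dt / 2, shifted(k1, dt / 2), beta, gamma, n)
    k3 = modified_derivatives(t + dt / 2, shifted(k2, dt / 2), beta, gamma, n)
    k4 = modified_derivatives(t + dt, shifted(k3, dt), beta, gamma, n)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_modified_sir(s0, i0, r0, beta, gamma, t_end, dt=0.01):
    """Return a list of (time, (S, I, R)) pairs from time 0 to t_end."""
    n = s0 + i0 + r0
    state = (s0, i0, r0)
    trajectory = [(0.0, state)]
    for step in range(int(round(t_end / dt))):
        state = rk4_step(step * dt, state, dt, beta, gamma, n)
        trajectory.append(((step + 1) * dt, state))
    return trajectory


if __name__ == "__main__":
    # Hypothetical population of 1000 with 10 initially infected.
    path = simulate_modified_sir(990.0, 10.0, 0.0, beta=0.5, gamma=0.1, t_end=200.0)
    print(path[-1])
```

<p>Since both rates start at zero and only later approach their exponential values, one would expect the early phase of the outbreak to be slower than in the standard model with the same parameters.</p>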
</div></div></div><div class="field field-name-taxonomy-vocabulary-2 field-type-taxonomy-term-reference field-label-above"><div class="field-label">Tags: </div><div class="field-items"><div class="field-item even"><a href="/taxonomy/term/15">fun calculations</a></div><div class="field-item odd"><a href="/taxonomy/term/9">public health</a></div></div></div>Sat, 23 Jul 2011 06:21:22 +0000root9 at https://smallperturbation.com