# De waarschijnlijkste Mol van 2021

De populaire tv-reeks “De Mol” is ondertussen bezig aan zijn negende seizoen in België. Ieder seizoen start met 10 deelnemers. Indien er in iedere aflevering zonder meer één persoon zou afvallen, is de kans dat iemand van de overblijvers de mol is, indien je geen extra info hebt, 1/(10 – n) na n afleveringen en zou deze blogpost nooit hebben bestaan.

Vermoedelijk om het spel wat interessanter te maken, zijn er vaak “speciallekes” (zoals vrijstellingen) die de kans dat iemand van de overblijvers de mol is (soms sterk) beïnvloeden. Deze blogpost berekent de theoretische kans dat iemand de mol is na n afleveringen, indien er geen extra info voorhanden is. Er wordt bijvoorbeeld geen rekening gehouden dat de mol goed in kansrekening zou kunnen zijn en zijn eigen kansen zou willen beïnvloeden, mocht hij dit kunnen. Ook wordt er verondersteld dat iedere andere kandidaat “even sterk” is, wat misschien in het begin nog wel zo zal zijn (bij benadering), maar naar het einde toe niet meer (sommige kandidaten zullen veel beter op het spoor zitten van de mol en hebben dus een kleinere kans dat ze het spel moeten verlaten). Dit gezegd zijnde, laten we maar eens kijken wat deze “mogelijks naïeve” aanpak oplevert 🙂

Deze pagina zal worden bijgewerkt naarmate het programma vordert. Bovenstaande figuur toont de kansen tot nu toe, die in volgende paragrafen gedetailleerd worden toegelicht.

## Regel van Bayes

Wat betreft de formules voor kansrekenen hebben we genoeg aan de regel van Bayes. Beschouw de discrete toevalsgrootheid M die de kans voorstelt dat iemand de mol is. Stel alle gebeurtenissen die plaatsvinden in de aflevering (en waar we rekening mee houden) voor als E. Dan is de kans dat iemand de mol is op het einde van de aflevering gegeven door:

$P[M|E]=\displaystyle\frac{P[E|M]\cdot P[M]}{P[E]}$,

m.a.w. de a-posteriori kans P[M|E] dat iemand de mol is na de gebeurtenissen in de aflevering is gelijk aan de a-priorikans dat iemand aan het begin van de aflevering de mol is, gegeven door P[M], vermenigvuldigd met de kans dat de gebeurtenissen optreden, gegeven dat iemand de mol is (t.t.z. P[E|M]), gedeeld door de kans dat de gebeurtenissen optreden P[E].

Om de notatie wat te verlichten schrijven we bijvoorbeeld P[Noah] voor P[M = Noah]. Ook laten we vaak gewoon “|E” weg als het duidelijk om een a posteriori kans aan het einde van de aflevering gaat, dus P[Noah] is dan eigenlijk P[M = Noah | E].

## Aflevering 1

Er strijden twee groepjes van 10 personen tegen elkaar, met elk één mol in! Indien de mol uit het groepje van de aanvallers in de auto kan geraken, wat het groepje verdedigers probeert de verhinderen, neemt die de plaats in van één van de verdedigers, waardoor er slechts 9 verdedigers doorgaan. Indien de aanvaller mol is, valt de mol uit het groepje van de verdedigers af, zodat er slechts één mol overblijft.

De aanvaller Noah slaagt erin om in de auto te geraken. De kans dat hij mol is, is 1/10. De kans dat iemand van de 9 overblijvende verdedigers (Annelotte, Dami, Jasmien, Katrien, Kevin, Lennart, Philip, Samina en Sven) mol is, is dus ook 1/10.

Op de uiteindelijke (tweede) eliminatie valt niemand af, maar er worden wel groene schermen getoond voor Annelotte, Jasmien, Katrien, Philip, Samina en Sven (en geen scherm voor de vier andere).

Hierdoor wordt P[E | M = Annelotte] evenredig met 4/10 (redenering: als Annelotte de mol is, dan kan iedere deelnemer uit het groepje van 4 het rood scherm hebben), en is bijvoorbeeld P[E | M = Dami] evenredig met 3/10 (redenering: als Dami de mol is, dan kunnen slechts 3 van de overige 4 deelnemers het rode scherm hebben, gezien Dami noodzakelijkerwijze een groen scherm heeft).

Intuïtieve verklaring: er is één niet-getoond scherm rood, waardoor de kans dat een bepaalde persoon waarvoor het scherm niet werd getoond de mol niet is, groter is dan voor iemand met een groen scherm.

Voor de mensen die nog steeds niet overtuigd zijn: beeld je in dat er slechts één persoon zou zijn waarvan het scherm niet wordt getoond, en de rest heeft een (getoond) groen scherm, dan is zijn scherm gegarandeerd rood en weet je dat hij de mol zeker niet is. Ook voor meerdere personen kun je dus uitspraken doen, maar je kan niet meer met 100% zekerheid zeggen dat iemand de mol niet is.

Dit leidt tot volgende a-posteriorikansen na aflevering 1:

P[Annelotte] = P[Jasmien] = P[Katrien] = P[Philip] = P[Samina] = P[Sven] = 1/9 = 11,1%.
P[Dami] = P[Kevin] = P[Lennart] = P[Noah] = 1/12 = 8,3%.

## Aflevering 2

De 10 overblijvers worden opgedeeld in twee groepjes van 5, waarbij slechts één groepje (bestaande uit Dami, Kevin, Noah, Philip en Sven) moet deelnemen aan de eliminatie. Er vallen bovendien 2 mensen uit dit groepje af, met name Dami en Kevin. De waarschijnlijkheid dat deze situatie optreedt indien één van de mensen in het groepje van 3 overblijvers de mol is, is gelijk aan 1/4 x 1/3 x 2 = 1/6 (slechts de combinatie (Dami, Kevin) uit 6 mogelijke koppels, gezien we de mol niet mogen kiezen). De kans dat deze situatie optreedt indien een persoon uit het groepje van 5 overblijvers (die niet deelnamen aan de eliminatie) de mol is, is gegeven door 1/5 x 1/4 x 2 = 1/10 (slechts de combinatie (Dami, Kevin), maar nu zijn er tien mogelijke koppels, gezien de mol er niet bij zit).

Dit is een variant op het fameuze driedeurenprobleem. Intuïtieve verklaring: het is “verdacht” dat die drie overblijvers er nog in zitten, zeker omdat het andere groepje geen eliminatie hoefde te doen (ze hebben zich dus “extra” bewezen).

Rekening houdend met de a-priorikansen (die gelijk zijn aan de a-posteriorikansen op einde van vorige aflevering) en met de regel van Bayes, zoals hierboven uitgelegd, krijgen we:

P[Philip] = P[Sven] = 17,9%.
P[Noah] = 13,4%.
P[Annelotte] = P[Jasmien] = P[Katrien] = P[Samina] = 10,7%.
P[Lennart] = 8,0%.

De kansen zijn het grootst voor Philip en Sven, gezien zij in aflevering 1 een getoond groen scherm hadden en in het groepje met 3 overblijvers zitten. De kans is het laagst voor Lennart, gezien hij de enige is uit het groepje die de eliminatie niet moest doen waarvan het scherm niet werd getoond in aflevering 1.

## Aflevering 3

In aflevering 3 zijn er twee mensen (namelijk Jasmien en Katrien) met één pasvraag. Bij hun score op 20 wordt m.a.w. automatisch 1 punt bijgeteld (tenzij ze al 20 hebben), m.a.w. één slecht antwoord verandert bij hen in een goed antwoord. Het zou een gemiste kans zijn mochten we het effect van pasvragen niet in rekening proberen te brengen, maar het is geen eenvoudig probleem waarbij we een aantal (hopelijk gerechtvaardigde) veronderstellingen zullen moeten doen. Zo veronderstellen we dat iedere speler de vragen willekeurig invult en dat er per vraag 4 antwoordmogelijkheden zijn. De totale score voor iedere speler (vóór pasvraagcorrectie) is op deze manier binomiaal verdeeld. Om de gebeurteniswaarschijnlijkheid P[E|M] in aflevering 3 te berekenen blijkt dat we genoeg hebben aan de kans dat iemand zonder pasvragen een lagere score heeft dan iemand met een pasvraag. Bij een gelijke score geldt de totale benodigde oplossingstijd als scheidingspunt (lager is beter).

Deze kans blijkt met de veronderstellingen hierboven gelijk te zijn aan 64,1%, m.a.w. bij een rechtstreeks duel tussen iemand met een pasvraag en iemand zonder pasvraag, heeft diegene zonder de pasvraag een kans van 64,1% om eruit te liggen!

Terug naar de eliminatie, waarbij Noah afvalt. De kans dat deze situatie gebeurt indien bijvoorbeeld Annelotte de mol is (of iemand anders zonder pasvraag) is gelijk aan P[Noah valt af | Annelotte is de mol] = P[Noah heeft slechtste score | Annelotte is de mol] = P[Noah heeft slechtere score dan Jasmien] x P[Noah heeft slechtere score dan Katrien] x … x P[Noah heeft slechtere score dan Sven] = 0,641^2 * 0,5^4. Indien Jasmien (of Katrien) de mol zou zijn is de kans echter gelijk aan 0,641 * 0,5^5.

Dit levert de volgende a-posteriorikansen na aflevering 3:

P[Philip] = P[Sven] = 21,8%.
P[Annelotte] = P[Samina] = 13,1%.
P[Jasmien] = P[Katrien] = 10,2%.
P[Lennart] = 9,8%.

## Aflevering 4

Annelotte bemachtigt 5 pasvragen, Sven krijgt 1 pasvraag en Katrien valt af. P[iemand zonder pasvragen heeft slechtere score dan iemand met 5 pasvragen] = 96,5%. Hierdoor wordt P[Katrien valt af | Annelotte is de mol] = P[Katrien heeft slechtere score dan Sven] x … = 0,641 x 0,5^4, P[Katrien valt af | Sven is de mol] = 0,965 x 0,5^4 en P[Katrien valt af | Jasmien is de mol] = 0,965 x 0,641 x 0,5^3.

Dit levert de volgende a-posteriorikansen na aflevering 4:

P[Philip] = 27,7%.
P[Sven] = 21,6%.
P[Samina] = 16,6%.
P[Jasmien] = 13%.
P[Lennart] = 12,5%.
P[Annelotte] = 8,6%.

## Aflevering 5

Samina valt af.

Dit levert de volgende a-posteriorikansen na aflevering 5:

P[Philip] = 33,2%.
P[Sven] = 25,9%.
P[Jasmien] = 15,6%.
P[Lennart] = 15%.
P[Annelotte] = 10,3%.

## Aflevering 6

Lennart heeft een vrijstelling en Jasmien valt af. P[Jasmien valt af | Philip is de mol] = 0,5^2, terwijl P[Jasmien valt af | Lennart is de mol] = 0,5^3.

Dit levert de volgende a-posteriorikansen na aflevering 6:

P[Philip] = 43,2%.
P[Sven] = 33,7%.
P[Annelotte] = 13,4%.
P[Lennart] = 9,7%.

## Aflevering 7

Iedereen heeft een pasvraag, behalve Philip die afvalt.

Dit levert de volgende a-posteriorikansen na aflevering 6:

P[Sven] = 59,3%.
P[Annelotte] = 23,6%.
P[Lennart] = 17,1%.

## Bijlage

De code die alle berekeningen doet en bovenstaande figuur genereert kan je hier vinden. Laat alstublieft iets weten als je ergens een fout vindt 🙂

# Neural networks in computer chess

## Introduction

As of December 2020, the top computer chess programs in the world use a neural network for position evaluation. Let’s take a look at two such open source engines: Leela Chess Zero and Stockfish.

## Leela Chess Zero

Starting in January 2018, the initial goal of the open source project Leela Chess Zero was to replicate Deepmind’s revolutionary engine AlphaZero. For more information, see https://en.wikipedia.org/wiki/Leela_Chess_Zero.

Leela Chess Zero’s success can be attributed to a combination of MCTS (Monte Carlo Tree Search) in combination with a relatively large neural network that is used for both evaluation and search (one of the network outputs is a “policy head” that ranks legal moves in order of importance to guide search).

The latest generation of neural network architecture is based on the AlphaZero architecture [Silver et al. 2017] and consists of a deep convolutional ResNet [He et al. 2015] structure at its core, followed by both “policy” and “value” heads, that represent a relative move importance and position evaluation, respectively. Notable additions w.r.t. the AlphaZero architecture are the Squeeze-and-Excitation blocks [Hu et al. 2017] in each residual layer, along with an additional “moves left head”.

## Stockfish

Since version 12, the open source project Stockfish makes use of a neural network for most of the position evaluation.

The neural network architecture, denoted NNUE (efficiently updatable neural network), is extremely original and originates from seminal work in [Nasu 2018].

As of December 2020, this neural network is used exclusively for static position evaluation, where it replaces a hand-crafted evaluation function that has taken tens of years of fine-tuning, obliterating it in the process (around 100 elo better).

## References

[He et al. 2015] Deep Residual Learning for Image Recognition
[Hu et al. 2017] Squeeze-and-Excitation Networks
[Nasu 2008] ƎUИИ Efficiently Updatable Neural-Network based Evaluation Functions for Computer Shogi
[Silver et al. 2017] Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

# Perl Weekly Challenge 083

Lately I got intrigued by the beautiful hidden depths (maybe some people prefer “atrocious inner guts”) of Perl and I decided to take up the Perl Weekly Challenge for fun (I “joined the club” with challenge 082 last week).

This week, the second task is equivalent to the following optimization problem:

Problem Definition

Given a set of natural numbers $S$, find a partition $(S_1,S_2)$ (i.e., disjoint subsets $S_1,S_2$ such that $S_1\cup S_2 = S$) such that the absolute value of the difference between the sums of numbers in each subset, denoted $|\sum S_1 - \sum S_2|$, is minimal across all possible partitions.

Complexity

[Note: complexity in this document is always w.r.t. the number of elements of the input set $S$.]

Unfortunately, this optimization problem (let’s denote it as $A$) is NP-hard, which means (loosely speaking) that it is at least as hard as every problem in NP or, more formally, that every problem in NP can be reduced in polynomial time to it.

To prove this statement, it suffices to show there is an NP-complete problem that can be reduced in polynomial time to $A$, as an NP-complete problem can be considered (loosely speaking) as a representative member of the set NP (in the sense that every other problem in NP can be transformed in polynomial time to it).

An example NP-complete problem of this kind is its “easier” brother $B$ the “two-way partitioning problem”, which tries to decide if there is a partition such that $\sum S_1=\sum S_2$ (and if there is one, return it!). It can indeed be shown that $B$ can be reduced to $A$ in polynomial time, as $A$ yields the solution to which we need only polynomial time processing (linear even) to lead to a solution of $B$ (i.e. if $|\sum S_1 - \sum S_2|=0$, then $S_1,S_2$ is a solution of $B$ and if not, no solution of $B$ exists; note that summing is linear).

Now, on to the question why $B$ is NP-complete. It is clear that it lies in NP because a solution, if it exists, can be verified to be correct in polynomial time (linear actually, all you need is summing) on a deterministic Turing machine. Moreover, it can be reduced in polynomial time to the so-called subset sum problem, as argued on https://en.wikipedia.org/wiki/Subset_sum_problem, by noting that it is equivalent to the subset sum problem over $\{S,-W\}$, with $W=\sum S/2$. This subset sum problem is known to be NP-complete.

This concludes the discussion that $B$ is NP-complete and, hence, $A$ is NP-hard.

Implementation

The above discussion unfortunately implies that no algorithm in polynomial time (on a deterministic Turing machine of course) to solve $A$ is currently known. If we would be able to find such an algorithm, we would have resolved a long standing question by proving that P = NP.

My naive implementation has exponential time complexity and can be found here.

Note that heuristic algorithms exist, see e.g. https://www.ijcai.org/Proceedings/09/Papers/096.pdf, to reduce the average running time considerably. However, all of them have worst case exponential time complexity.

# White has the edge in random chess

We show that white is more likely to win the game than black in chess where both sides pick moves at random. The statistical significance after playing 1,000,000,000 games is higher than five sigma, which is the threshold for CERN to announce the discovery of a new particle, so it is also good enough for us.

## Rules of the game

• Each side plays moves at random, uniformly (with equal probability) over the set of legal moves.
• The game ends if one of the following conditions is met:
• Checkmate.
• Stalemate.
• Draw by fifty-move rule if during the last 100 halfmoves (plies) no pawn movements or piece captures have happened.

Note that we don’t end a game based on the threefold repetition rule, most importantly because the library that we are using to generate the games doesn’t offer this functionality (yet). However, we believe that the presented main conclusion (that white has the edge) still holds in this case, mainly because this edge can be completely attributed to a higher probability during the first 15 moves (see further), where the probability of threefold repetitions is likely to be very small.

## Computer simulations

The game generation is written in Java and makes use of the library jchess. Games were generated on quad-core Intel Core i5-5300U at a rate of about 25 million games per hour. Hence, about 40 hours of computing time were necessary to generate 1 billion games.

For each chess game, we used a new instance of the random number generator java.util.Random, which uses a 48-bit linear congruential pseudorandom number generator internally and is thread-safe in the sense that constructor invocations from different threads lead to uniquified seeds.

The processing of the results was performed in Python.

## Results

Here are the probabilities of a white win, black win, stalemate or draw by fifty-move rule, along with a 95% confidence interval. For a detailed discussion of the relevant formulas, we refer to the next section. Note that the number of games for each case can be easily obtained by dropping the decimal point (all digits are significant), as the total number of games is exactly equal to 1 billion (e.g. there are 76,660,772 games where white wins).

Pr[white wins]      =  7.6660772% ± 0.0016490%
Pr[black wins]      =  7.6597606% ± 0.0016484%
Pr[stalemate]       =  6.3434934% ± 0.0015107%
Pr[fifty-move rule] = 78.3306688% ± 0.0025536%

Note that the estimated win probability for white is 0.0063166% higher that that for black. In the next section we show that this apparent advantage for white is indeed statistically significant up to more than five sigma. This difference can be completely attributed to the statistics of the set of games that take up to 15 moves, where white is expected to win 0.2048859% of them, and black 0.1987829% = 0.2048859% – 0.0061030%. This behavior is also clear from the following histograms.

## Discussion

In this section, we detail all steps that lead to the conclusion that white has the edge in random chess (up to more than five sigma statistical significance).

Denote the total number of games played as $\displaystyle N$ (equal to 1 billion in our experiment) and the number of games that white won as $N_w$ (equal to 76,660,772 in our experiment). Denote the true probability that white wins as $p_w$. It is easy to show that the following estimator of $p_w$ is unbiased, i.e., that $E[\widehat{p_w}]=p_w$.

$\displaystyle\widehat{p_w}=\frac{N_w}{N}$

As all generated games are i.i.d. (statistically independent identically distributed) samples of a random chess game, $N_w$ is binomially distributed with p.m.f. (probability mass function) given by

$\displaystyle p(N_w)=\frac{N!}{(N-N_w)!N_w!}p_w^{N_w}(1-p_w)^{N-N_w}.$

As both $N_w$ and $N$ are very large, a normal approximation is justified, treating $N_w$ as a continuous variable and yielding the p.d.f. (probability density function)

$\displaystyle p(N_w)=\frac{1}{\sqrt{2\pi}\sigma_{N_w}}\exp\left(-\frac{(N_w-Np_w)^2}{2\sigma_{N_w}^2}\right),$

with $\sigma_{N_w}^2=Np_w(1-p_w)$.

This implies that the estimator $\widehat{p_w}$ is also normally distributed with mean $p_w$ and variance $\sigma_{\widehat{p_w}}^2=p_w(1-p_w)/N$, i.e.,

$\displaystyle p(\widehat{p_w}|p_w)=\frac{1}{\sqrt{2\pi}\sigma_{\widehat{p_w}}}\exp\left(-\frac{(\widehat{p_w}-p_w)^2}{2\sigma_{\widehat{p_w}}^2}\right).$

Note that, although the variance $\sigma_{\widehat{p_w}}^2$ depends on $p_w$, for the present case of small deviations around $\widehat{p_w}$, it holds that $\widehat{p_w}\approx p_w$ and hence also $\sigma_{\widehat{p_w}}^2\approx \sigma_w^2\triangleq \widehat{p_w}(1-\widehat{p_w})/N$. This yields the following expression for the likelihood function of $p_w$, given $\widehat{p_w}$:

$\displaystyle p(p_w|\widehat{p_w})=\frac{1}{\sqrt{2\pi}\sigma_w}\exp\left(-\frac{(\widehat{p_w}-p_w)^2}{2\sigma_w^2}\right).$

This explains the earlier given 95% confidence interval for $p_w$, namely $\displaystyle\widehat{p_w} \pm 1.96\sigma_w=\frac{N_w}{N} \pm \frac{1.96}{N}\sqrt{\frac{N_w(N-N_w)}{N}}=7.6660772\% \pm 0.0016490\%.$

Completely similar, we can show that the likelihood function of $p_b$ (the true probability that black wins),given $\widehat{p_b}=N_b/N$ (with $N_b$ the number of games that black won), is given by

$\displaystyle p(p_b|\widehat{p_b})=\frac{1}{\sqrt{2\pi}\sigma_b}\exp\left(-\frac{(\widehat{p_b}-p_b)^2}{2\sigma_b^2}\right),$

with $\sigma_b^2\triangleq \widehat{p_b}(1-\widehat{p_b})/N$. This means that the likelihood function of $p_w-p_b$ is normally distributed with mean $\widehat{p_w}-\widehat{p_b}$ and variance $\sigma^2=\sigma_w^2+\sigma_b^2$. Finally, this implies that the probability that $p_w$ is higher than $p_b$ is given by

$\displaystyle \text{Pr}[p_w>p_b]=\text{Pr}[p_w-p_b>0]=\frac{1}{2}+\frac{1}{2}\text{erf}\left(\frac{\widehat{p_w}-\widehat{p_b}}{\sqrt{2}\sigma}\right)$.

Substituting the values of $N_w$, $N_b$ and $N$ obtained in the experiment into the equations eventually yields

$\text{Pr}[p_w>p_b\ |\ \widehat{p_w},\ \widehat{p_b}]=99.9999945\%$,

i.e., from the observed experiment we can conclude that the likelihood that white has the edge is $99.9999945\%$ (or, equivalently, the likelihood that we take the wrong conclusion and actually white doesn’t have the edge is smaller than 1 in 18 million). This corresponds to a confidence level of $\widehat{p_w}-\widehat{p_b}=5.3\sigma$.

## Conclusion

After playing 1 billion random chess games (where each side moves at random), we can conclude that white has the edge with a likelihood of 99.9999945% (5.3 sigma) when both sides play random moves. The higher winning odds can be completely attributed to the behavior during the first fifteen moves, after which white’s advantage evaporates and there’s no noticeable difference between the two sides anymore.

# Falling magnet in a conductive tube

Recently I got captivated by the phenomenom of a falling magnet that is decelerated by its movement through a conductive tube, as shown in the following video.

Sure enough, the principle is easy to understand, but I was interested in the details of the problem, i.e. how does the movement look like given all relevant parameters (tube conductivity, thickness, velocity of the magnet…).

It turns out there is an interesting (but rather concise) article on Wikipedia about magnetic damping. It is stated that the force on the magnet depends linearly on its velocity $v$, i.e. $F=Kv$, with $K$ a constant so-called damping coefficient. In this post, I would like to comment on this and demonstrate that it is only an approximation of reality.

Consider a magnet moving along the $z$-axis with constant velocity $v$. This $z$-axis is at the same time the axis of a circular tube of a certain thickness (which we assume sufficiently small). Denote the flux of the magnetic induction, generated by the magnet, through the interior cross section of the tube at coordinate $z$ and time $t$ as $f_m(z-vt)$ (note that we make the reasonable assumption that $v$ is much smaller than the speed of light $c$, i.e. we consider a quasi-static variant of Maxwell’s equations, which is well justified and facilitates the analysis considerably).

The moving magnet will induce circular currents in the tube. It is easy to see that this current density will also behave “wave-like” as $i(z-vt)$. This current on its own generates a magnetic induction (which excerts a force on the magnet). Denoting the flux of the magnetic induction, generated by a unit current loop at $z=0$, through the interior cross section of the tube at $z$ as $f_i(z)$, the flux of the magnetic induction generated by the current density $i(z-vt)$ is given by the convolution integral (again under the earlier mentioned quasi-static assumption)

$f_I(z-vt)= \displaystyle\int f_i(z-z')i(z'-vt)dz'$

Now we apply Faraday’s law on the total flux $f_m+f_I$, which yields, with $R$ the resistance per unit length of the conductive tube (for the circular currents):

$\displaystyle f_m^\prime(z-vt)+\int f_i^\prime(z-z')i(z'-vt)dz'=-\frac{R}{v}i(z-vt)$,

where the primes on functions indicate derivatives. Going to the Fourier domain (with capital letters for the Fourier transform), the convolution becomes a multiplication, eventually yielding

$\displaystyle I(\omega) = \frac{-jv\omega F_m}{jv\omega F_i+R}$.

As the force $F$ on the magnet is linearly depending on the current density $i(z-vt)$, it is apparent that linearity with $v$ is only valid for sufficiently small values of $v/R$ (w.r.t. $|\omega F_i|$, for all $\omega\in\mathbb{R}$).

Conclusion

We showed that the force $F$ acting on a magnet moving with velocity $v$ through a conductive tube is only approximately linearly proportional to $v$. Only if the velocity is low enough, the magnetic induction of the induced currents is sufficiently small such that its influence on the induced e.m.f. is small compared to the effect of the magnetic induction of the magnet (i.e. drop the term with $f_i$ in the equations above), and linearity is then approximately valid.

# Derivation of the classical lift and drag formulas of an airplane wing

Introduction

In a book about the principles of aircraft flight, the following formula for the lifting force per unit length $l$, acting on an airplane wing with chord length $c$ which moves at a relative speed $V$ in a fluid with mass density $\rho$, was stated without proof.

$l=\frac{1}{2}C_L\rho V^2c\qquad\qquad\qquad(1),$

with $C_L$ the so called lift coefficient, a dimensionless number that only depends on the shape of the airplane wing and the angle of attack of the wing w.r.t. the fluid flow. In this post, an elegant derivation of this formula is given (not first the first time of course, see also the last section with some interesting links on the web), together with a sufficient set of conditions that ensure the formula is valid. From the derivation, it can be immediately concluded that the drag force per unit length has the same form.

Governing equations

The starting point of our analysis are the Euler equations in an incompressible, inviscid fluid with steady flow. These equations impose a connection between the components of the velocity vector field $\mathbf{v}(x,y)$ and the pressure field $p(x,y)$ and are given in differential form by:

$\rho \mathbf{v}\cdot\nabla\mathbf{v} = -\nabla p,$
$\nabla\cdot\mathbf{v}=0.$

Supplemented with the appropriate boundary conditions, the system of equations yields a unique solution for each value of the parameter vector $(\rho,\ V,\ c)$ if the angle of attack and the shape of the wing (and also the static air pressure $P_0$) are given. For completeness, let’s state the boundary conditions.

$\mathbf{v}\cdot\mathbf{n}=0\quad \mbox{on the wing contour},$
$\lim_{|x|+|y|\rightarrow\infty}\mathbf{v}=V\mathbf{u}_x,$
$\lim_{|x|+|y|\rightarrow\infty}p=P_0-\frac{1}{2}\rho V^2.$

The system of non-linear equations seems quite difficult to solve. In fact, for general shapes only numerical algorithms can solve the equations. The FoilSim III software program is an example where the previously given Euler equations (with the same assumptions of incompressibility and inviscosity of the fluid, as indicated here) are numerically solved.

Once we have the solution, the lift per unit length is given by the following closed path integral over the wing contour $\mathcal{C}$ in contourclockwise direction.

$l=\oint_{\mathcal{C}_-}p\ dx\qquad\qquad\qquad(2).$

Likewise, the drag per unit length $d$ is given by

$p=\oint_{\mathcal{C}_+}p\ dy.$

If it is impossible to obtain analytical solutions of the Euler equations for a generally shaped wing, how could we possibly derive expression (1)? As it turns out, we don’t have to solve them really, which is quite remarkable!

Dimensional analysis

Let’s focus on the lift per unit length $l$, as given in formula (2). For a fixed wing shape and angle of attack (and static pressure $P_0$), the solution of the Euler equations with the stated boundary conditions will only depend on $\rho$, $V$ and $c$, i.e. both $p(x,y)$ and $\mathbf{v}(x,y)$ depend only on these three independent variables. Likewise, the same holds for $l$, i.e. there exists a function $f$ such that

$l=f(\rho,V,c).$

Or, put another way, a function $\widetilde{f}$ exists such that

$\widetilde{f}(l,\rho,V,c)=0.$

The key point is to now employ the Buckingham $\pi$ theorem, which leads to the following so called dimensional matrix $M$, where the columns represent the four variables $l,\ \rho,\ V$ and $c$ and the rows represent the three fundamental dimensions mass, length and time.

$M=\begin{pmatrix}1&1&0&0\\ 0&-3&1&1\\-2&0&-1&0\end{pmatrix}.$

It can be seen that the null space of $M$ is spanned by the vector $\begin{pmatrix}1&-1&-2&-1\end{pmatrix}^T$. The Buckingham $\pi$ theorem now states that there is only one so called $pi$-group $\pi_1=l\rho^{-1}V^{-2}c^{-1}$. Moreover, the theorem implies that there exists a function $g$ such that

$\widetilde{f}(l,\rho,V,c)=0\iff g(\pi_1)=0.$

Due the uniqueness of the solution, $g$ will only have one zero, let’s call it $\alpha$ which is naturally independent of all the four considered variables. The let’s us conclude that

$\pi_1=\displaystyle\frac{l}{\rho V^2 c}=\alpha\Rightarrow l=\alpha\rho V^2 c,$

which proves formula (1), for $C_L=2\alpha$.

This example with only one $pi$-group is completely equivalent with the technique of looking at the dimensions of the variables $\rho$ ($\mbox{kg}/\mbox{m}^2$), $V$ ($\mbox{m}/\mbox{s}$) and $c$ (m) and trying to find a product $\rho^aV^bc^d$ that has the same dimensions of $l$ ($\mbox{N}/\mbox{m}=\mbox{kg}/\mbox{s}^2$).

# DVB-T dipole antenna

DVB-T (Digital Video Broadcasting – Terrestrial) is a European standard for air broadcast transmission of digital television. Most televisions are sold with an internal DVB-T decoder, including mine, such that you only need to connect an antenna to the television set. The picture below shows the rear connectors of my television set, with a $75\Omega$ coaxial cable input to connect the antenna to the television.

How long should the dipole antenna be?

Note that the label next to the antenna input on my television set reads $75\Omega$. This means that the input behaves as a $75\Omega$ resistance at the television signal frequencies (several hundred Megahertz). As this resistance is fixed (i.e. we cannot change it), we can only maximize the signal strength the television receives by maximizing the current we pump in it (this evidently maximizes the voltage over the input terminals).

Predicting the actual current that will flow through this resistor in a real-life environment, with the dipole antenna surrounded by obstacles such walls, furniture, windows or metallic objects (and, of course, the television itself!), turns out to be a very difficult problem. In essence, you’d have to solve Maxwell’s equations for the full three-dimensional setup. Unless you recently acquired a supercomputer, this idea might be less promising than initially thought. We can ask ourselves a natural other question: what will the current be through a $75\Omega$ resistor connected to the antenna terminals if there are absolutely no obstacles around it, i.e. in the hypothetical case the antenna is embedded in a free space medium (see picture below). We can expect that this situation models reality pretty well if the antenna is sufficiently well separated from any obstacles.

Fortunately, we can now simulate (i.e. solve the Maxwell equations numerically) the behavior of the antenna in a reasonable amount of time, loaded with $75\Omega$ in the simplified environment. Let us assume that an electromagnetic plane wave excitation with frequency $f$ impinges on the above structure. It can be proven that the antenna behaves as a Thévenin equivalent circuit with open circuit voltage $V_0$ and input impedance $R_a + j X_a$ (see diagram below).

The curves of the radiation resistance $R_a$ and the radiation reactance $X_a$ as a function of the length of the dipole antenna are shown below (source: http://en.wikipedia.org/wiki/Dipole_antenna).

Now, remember we wanted to maximize signal strength, i.e. the voltage $V_L$ over the $75\Omega$ resistor (representing the television input signal), and this can be accomplished by tuning the length $L$ of the dipole antenna. It can be shown that the open circuit voltage $V_0$ is proportional to $\sqrt{R_a}$ (at least if the antenna gain in the direction of arrival of the plane wave doesn’t change much for the considered length range, which is a reasonable assumption). This lets us conclude that, with $R_L=75\Omega$,

$|V_L|\sim \displaystyle\frac{R_L\sqrt{R_a}}{\sqrt{(R_L+R_a)^2+X_a^2}}.$

Maximizing this w.r.t. $L$ (for the impedance curves above) gives us the following length for the dipole antenna, for maximal signal strength:

$L\approx0.47\lambda=0.47\displaystyle\frac{c}{f}.$

Here $c=299792458\mbox{m/s}$ denotes the speed of light in vacuo, while $f$ denotes the frequency of the electromagnetic excitation.

As an example, in the city I live, the DVB-T channels are being broadcast around $f=482\mbox{MHz}$, which means the total length of the dipole antenna should be $29.2\mbox{cm}$.

As a conclusion, I soldered two pieces of copper wire to the end of a $75\Omega$ coaxial cable (which is a standard impedance level for television applications), and connected the other end to the television set (image soon following). The distance between the endpoints of both wires was adjusted to the desired length of $29.2\mbox{cm}$.

Wait, shouldn’t you use balun?

By connecting the symmetrical (w.r.t. to the plane perpendicular to its wires and through its middle) dipole antenna to an asymmetrical (w.r.t. this very same plane) coaxial cable, the whole electromagnetic problem becomes asymmetrical. This can be seen in the picture below (from http://en.wikipedia.org/wiki/Dipole_antenna), where the left dipole wire is soldered to the inner conductor, and the right dipole wire is soldered to the outer conductor of the coaxial cable. In the previous situation (in free space, with a resistor connected to both terminals), an impinging electromagnetic field induced an opposite (and thus equal in magnitude) current through the dipole terminals. Now the current through the dipole terminals will be different in magnitude, because of an extra current on the outside of the outer conductor of the coaxial cable.

From these considerations, we can already conclude that the characteristics of the configuration dipole + coaxial cable will differ from the situation with dipole only. The radiation pattern of the configuration with coaxial cable (and driven at the other end of the cable) will be different. More importantly, if the antenna is used for receiving, with a $75\Omega$ load at the other end of the coaxial, the Thévenin equivalent of the configuration dipole + coaxial cable will be different from the free space case, ruining a bit our previous optimization w.r.t. the wire length.

Our previous lines of thought are confirmed by some very nice numerical simulations, found at http://www.g0ksc.co.uk/creatingabalun.html. As can be seen below, the current distribution along the dipole wires has a different magnitude at its terminals, because a relatively large current is flowing at the outside of the outer coax conductor.

The solution to this problem is to connect a balun between the dipole antenna and the coaxial cable, effectively forcing the current at the terminals to be equal in magnitude by discouraging a common mode current through the coaxial cable.

However, I would recommend to only use a balun if the signal quality still is too low. In my case, the situation without balun worked just fine.