salt

How to estimate the total number of Nazi Tanks?

Historical Context

During World War II, Allied intelligence faced the challenge of estimating German tank production. This led to the development of statistical methods that significantly outperformed traditional intelligence gathering.

Here's how the simulation works:

We have a secret number of tanks (500 in this case).
We pretend to "capture" 5 tanks and look at their serial numbers.
Based on these 5 numbers, we try to guess the total number of tanks.
We repeat this process 1000 times to see how good our guessing methods are.

Here's the strategy:

The "Simple" method (MLE): We just use the highest number we see.
The "Smart" method (Unbiased): We take the highest number and add the average gap between the serial numbers we saw, which accounts for the tanks we didn't see.
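The steps above can be sketched in a few lines of Python. This is a minimal sketch, not the code behind the original chart: the function names, the fixed seed, and the use of the standard-library `random` module are assumptions, and the "Smart" formula is the maximum-plus-average-gap estimator described later in this post.

```python
import random
import statistics

def simple_estimate(serials):
    """'Simple' method (MLE): the highest serial number observed."""
    return max(serials)

def smart_estimate(serials):
    """'Smart' method: maximum plus the average gap, m + (m - k) / k."""
    m, k = max(serials), len(serials)
    return m + (m - k) / k

def simulate(true_n=500, k=5, trials=1000, seed=0):
    """Capture k tanks out of true_n, `trials` times, and record both guesses."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    simple, smart = [], []
    for _ in range(trials):
        # Serial numbers run 1..true_n; each tank can be captured only once.
        captured = rng.sample(range(1, true_n + 1), k)
        simple.append(simple_estimate(captured))
        smart.append(smart_estimate(captured))
    return simple, smart

simple, smart = simulate()
print("Simple method, average guess:", statistics.mean(simple))
print("Smart method, average guess:", statistics.mean(smart))
```

The exact averages depend on the seed, so they will differ slightly from run to run and from the numbers quoted in this post.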

Observations:

The "Simple" method (blue) tends to guess too low. Its average guess is about 416 tanks, which is less than the real 500.
The "Smart" method (orange) does better. Its average guess is about 498 tanks, very close to the real 500!
But notice how the orange bars are more spread out. This means the "Smart" method can sometimes be way off, even though it's better on average.
The "Simple" method is more consistent (the blue bars are more bunched together), but it's consistently too low.
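As a sanity check on these observations, the theoretical mean and spread can be worked out for N = 500 and k = 5. The formulas below are the standard results for sampling without replacement; they are not taken from the simulation itself, so treat this as a rough cross-check rather than the source of the chart's numbers.

```latex
\begin{align*}
E[m] &= \frac{k(N+1)}{k+1} = \frac{5 \cdot 501}{6} = 417.5 \\
\operatorname{sd}(m) &= \sqrt{\frac{k(N-k)(N+1)}{(k+1)^2(k+2)}}
  = \sqrt{\frac{5 \cdot 495 \cdot 501}{36 \cdot 7}} \approx 70.1 \\
E[\hat{N}] &= N = 500, \qquad
\operatorname{sd}(\hat{N}) = \frac{k+1}{k}\,\operatorname{sd}(m) \approx 84.2
\end{align*}
```

Here m is the "Simple" guess and N̂ = m + (m - k) / k is the "Smart" guess. The expected 417.5 is in line with the simulated average of about 416, and the "Smart" method's spread is 20% wider, which is why the orange bars look more spread out.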

Estimation Methodology

Basic Maximum Likelihood Estimator

The simplest approach uses the maximum observed serial number (m) as an estimator:

N̂ = m

While simple, this estimator is biased low: the sample maximum can never exceed the true total, P(N̂ ≤ N) = 1, and it usually falls short of it, so E[N̂] < N.

Improved Estimators

Sample Maximum Plus Average Gap

A more sophisticated estimator adds the average gap between observed serial numbers:

N̂ = m + (m - k) / k

Where:

m: maximum observed serial number
k: number of observed samples

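This can be interpreted as the maximum plus the average gap: m - k serial numbers below the maximum went unobserved, and they are spread over k gaps. Adding that average gap removes the downward bias of the plain maximum.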
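A small worked example may help here. The serial numbers below are made up for illustration, and the function name is an assumption.

```python
def max_plus_average_gap(serials):
    """Estimate the total as m + (m - k) / k."""
    m = max(serials)   # maximum observed serial number
    k = len(serials)   # number of observed samples
    return m + (m - k) / k

# Hypothetical captured serial numbers.
captured = [19, 40, 42, 60]
print(max_plus_average_gap(captured))  # 74.0
```

With m = 60 and k = 4, there are 56 unseen serial numbers below the maximum, an average gap of 14, so the estimate is 60 + 14 = 74.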

Derivation from Order Statistics

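The estimator can be derived from order statistics. For a sample of size k drawn without replacement from a uniform discrete distribution on {1, ..., N}:

E[m] = k * (N + 1) / (k + 1)

Solving for N and replacing E[m] with the observed maximum yields the unbiased estimator:

N̂ = m * (k + 1) / k - 1

This is the same as m + (m - k) / k, just written differently.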
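The unbiasedness can be checked exactly for a concrete case. The sketch below assumes sampling without replacement, where the maximum equals j with probability C(j-1, k-1) / C(N, k); under that assumption the exact expectation works out to k(N + 1) / (k + 1), which is what makes the estimator above land exactly on N. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def expected_max(n, k):
    """Exact E[m] when k serials are drawn without replacement from 1..n."""
    total = comb(n, k)
    # P(m = j) = C(j-1, k-1) / C(n, k): the other k-1 serials lie below j.
    return sum(Fraction(j * comb(j - 1, k - 1), total) for j in range(k, n + 1))

n, k = 500, 5
e_m = expected_max(n, k)
print(e_m)                           # 835/2, i.e. 417.5
print(e_m * Fraction(k + 1, k) - 1)  # 500
```

Using `Fraction` keeps the arithmetic exact, so the final value is exactly 500 rather than a floating-point approximation.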

Probability Analysis

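The probability of observing a specific set of serial numbers {s₁, ..., sₖ} given N tanks, when N is at least the largest observed serial m, is:

P({s₁, ..., sₖ} | N) = k! / (N * (N-1) * ... * (N-k+1))

For N < m the probability is zero, because a serial number larger than N could not have been observed. For N ≥ m the probability shrinks as N grows, so maximizing it (or its logarithm) with respect to N picks the smallest admissible value, N = m. That is the maximum likelihood estimator.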
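To see the maximization concretely, the likelihood can be evaluated over a range of candidate totals. This is a sketch with made-up serial numbers; the function name and the candidate range are assumptions.

```python
from math import comb

def likelihood(n, serials):
    """P(observed set | n tanks), sampling without replacement."""
    m, k = max(serials), len(serials)
    if n < m:
        return 0.0          # a serial above n could not have been seen
    return 1 / comb(n, k)   # equals k! / (n * (n-1) * ... * (n-k+1))

# Hypothetical captured serial numbers.
observed = [19, 40, 42, 60]
candidates = range(50, 81)
best = max(candidates, key=lambda n: likelihood(n, observed))
print(best)  # 60
```

The likelihood is zero below 60 and strictly decreasing from 60 upward, so the best candidate is the largest observed serial number.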

1mo ago · 9.5K views

Dangerman

Infosys1mo

I watched this video on Numberphile. I love maths

salt

Gojek1mo

Yeah it’s amazing hahaha

IseeDarknessInLight

Tao1mo

I hate statistics.

salt

Gojek1mo

Okay

Harshtech

Deloitte1mo

What do you do bro as a profession

salt

Gojek1mo

Background in Computer Science. Programming, Stats, Data, Product throughout my career

Whitefang

Stealth1mo

Brother @salt

salt

Gojek1mo

Hey bro, I’m sorry 😞

VioletMovie

Accenture1mo

Damn bro wish I was intelligent enough to understand this

salt

Gojek1mo

Hehehehe
