Software Engineers Community

by Coy Carmden

10 days ago

How to estimate the total number of Nazi Tanks?

**Historical Context**

During World War II, Allied intelligence faced the challenge of estimating German tank production. This led to the development of statistical methods that significantly outperformed traditional intelligence gathering.

Here's how the simulation works:
1. We have a secret number of tanks (500 in this case).
2. We pretend to "capture" 5 tanks and look at their serial numbers.
3. Based on these 5 numbers, we try to guess the total number of tanks.
4. We repeat this process 1000 times to see how good our guessing methods are.

Here's the strategy:
- The "Simple" method (MLE): We just use the highest number we see.
- The "Smart" method (Unbiased): We use a slightly more complicated calculation that tries to account for the tanks we didn't see.
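The steps and the two strategies above can be sketched in a few lines of Python. This is a minimal sketch, not the exact code behind the chart: the function name `simulate`, the fixed seed, and the use of "maximum plus average gap" as the "Smart" formula are assumptions made for illustration.

```python
import random
import statistics


def simulate(true_n=500, k=5, trials=1000, seed=0):
    """Run the capture experiment and return the guesses of both methods."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    simple, smart = [], []
    for _ in range(trials):
        # "Capture" k distinct tanks; serial numbers run from 1 to true_n.
        serials = rng.sample(range(1, true_n + 1), k)
        m = max(serials)
        simple.append(m)               # "Simple" method: highest serial seen
        smart.append(m + (m - k) / k)  # "Smart" method: maximum plus average gap
    return simple, smart


simple, smart = simulate()
for name, guesses in (("Simple", simple), ("Smart", smart)):
    print(name,
          "mean:", round(statistics.mean(guesses), 1),
          "stdev:", round(statistics.stdev(guesses), 1))
```

The exact averages depend on the seed, so the printed numbers will differ slightly from run to run if the seed is changed.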

Observations:
1. The "Simple" method (blue) tends to guess too low. Its average guess is about 416 tanks, which is less than the real 500.
2. The "Smart" method (orange) does better. Its average guess is about 498 tanks, very close to the real 500!
3. But notice how the orange bars are more spread out. This means the "Smart" method can sometimes be way off, even though it's better on average.
4. The "Simple" method is more consistent (the blue bars are more bunched together), but it's consistently too low.

**Estimation Methodology**

**Basic Maximum Likelihood Estimator**

The simplest approach uses the maximum observed serial number (m) as an estimator:

N̂ = m

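While simple, this estimator is biased low: the maximum observed serial number can never exceed N, so P(N̂ ≤ N) = 1, and it falls short of N whenever the highest-numbered tank is not in the sample. Hence E[N̂] < N whenever k < N.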

**Improved Estimators**

**Sample Maximum Plus Average Gap**

A more sophisticated estimator adds the average gap between observed serial numbers:

N̂ = m + (m - k) / k

Where:
- m: maximum observed serial number
- k: number of observed samples

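This can be interpreted as the maximum plus the average gap: the m - k unobserved serial numbers below the maximum are spread over k gaps, so the average gap is (m - k) / k. When tanks are sampled uniformly without replacement, this estimate is unbiased.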
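As a small worked example, here is the formula applied to four captured serial numbers. The function name `max_plus_average_gap` and the serial numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def max_plus_average_gap(serials):
    """Estimate N as the sample maximum plus the average gap."""
    m = max(serials)  # maximum observed serial number
    k = len(serials)  # number of observed samples
    return m + (m - k) / k


captured = [19, 40, 42, 60]  # hypothetical serial numbers
print(max_plus_average_gap(captured))  # 60 + (60 - 4) / 4 = 74.0
```

Here 56 serial numbers below 60 were not observed; spread over 4 gaps, that is an average gap of 14, giving an estimate of 74.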

**Derivation from Order Statistics**

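The estimator can be derived from order statistics. For a sample of size k drawn without replacement from a uniform discrete distribution on {1, ..., N}:

E[m] = k * (N + 1) / (k + 1)

Solving for N and replacing E[m] with the observed maximum m yields the unbiased estimator:

N̂ = m * (k + 1) / k - 1

This is algebraically the same as m + (m - k) / k.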
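The unbiasedness claim can be checked exactly for small cases by enumerating every possible sample. The sketch below assumes sampling without replacement with all k-subsets equally likely; the function name `expected_estimate` and the chosen (N, k) pairs are illustrative only.

```python
from fractions import Fraction
from itertools import combinations


def expected_estimate(n, k):
    """Exact expectation of m * (k + 1) / k - 1 over all k-subsets of 1..n."""
    samples = list(combinations(range(1, n + 1), k))
    # Use exact fractions so rounding cannot hide a small bias.
    total = sum(Fraction(max(s) * (k + 1), k) - 1 for s in samples)
    return total / len(samples)


for n, k in [(10, 2), (12, 3), (20, 4)]:
    # Each expectation should equal the true n if the estimator is unbiased.
    print(n, k, expected_estimate(n, k))
```

Brute-force enumeration only scales to small N and k, but it confirms the algebra without relying on simulation noise.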

**Probability Analysis**

The probability of observing a specific set of serial numbers {s₁, ..., sₖ} given N tanks is:

P({s₁, ..., sₖ} | N) = k! / (N * (N-1) * ... * (N-k+1))

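This holds for N ≥ m; for N < m the probability is zero, because a serial number larger than N cannot be observed. For N ≥ m the probability strictly decreases as N grows, so maximizing it (or its logarithm) with respect to N picks the smallest admissible value, N̂ = m, which is the maximum likelihood estimator.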
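To see where this probability peaks, it can be evaluated over a range of candidate values of N. This is a minimal sketch: the function name `likelihood`, the serial numbers, and the candidate range are hypothetical.

```python
from math import comb


def likelihood(serials, n):
    """P(observing this set of serials | n tanks), sampling without replacement."""
    k = len(serials)
    if n < max(serials):
        return 0.0  # a serial number larger than n cannot be observed
    return 1 / comb(n, k)  # equals k! / (n * (n-1) * ... * (n-k+1))


captured = [19, 40, 42, 60]  # hypothetical serial numbers
candidates = range(50, 101)  # hypothetical search range for N
best = max(candidates, key=lambda n: likelihood(captured, n))
print(best)  # 60, the maximum observed serial number
```

Every candidate below 60 has probability zero, and every candidate above 60 has a smaller probability than 60 itself, so the search lands on the sample maximum.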

Jordon Taye

Infosys

9 days ago

I watched this video on numberphile. I love maths

Dezi Gabriel

Gojek

9 days ago

Yeah it’s amazing hahaha

Kalan Vernon

Accenture

9 days ago

Damn bro wish I was intelligent enough to understand this

Isaiah Lee

Gojek

9 days ago

Hehehehe

Kendall Lee

Stealth

9 days ago

Bhai @salt

Dezi Gabriel

Gojek

9 days ago

Arey bru I’m sorry 😞

Jordon Nadeen

Deloitte

9 days ago

What do you do bro as a profession

Jordon Vernon

Gojek

9 days ago

Background in Computer Science. Programming, Stats, Data, Product throughout my career

Isaiah Carmden

Tao

9 days ago

I hate statistics.

Jordon Carmden

Gojek

9 days ago

Okay

Grapevine™ 2024, All rights reserved