Hacker News

Here's a logarithmic fact that I've made use of frequently:

If X is a random variable having a uniform distribution between zero and one, then –ln(X)/λ has an exponential distribution with rate λ.

This relationship comes in handy when, for example, you want to draw weighted random samples or generate event times for simulations.
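As a minimal sketch of the transform itself (the rate and sample count here are arbitrary, chosen just for illustration):

```python
import math
import random

def exponential_sample(lam, rng=random.random):
    """Draw one exponential(lam) variate from a uniform variate.

    Uses 1 - rng() so the argument to log stays in (0, 1],
    since random() can return exactly 0.
    """
    u = rng()
    return -math.log(1.0 - u) / lam

random.seed(0)
lam = 2.0
samples = [exponential_sample(lam) for _ in range(100_000)]
mean = sum(samples) / len(samples)  # should be close to 1/lam = 0.5
```

The sample mean landing near 1/λ is a quick sanity check that the transform produces the right distribution.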



The general version of this is called inverse transform sampling [0], which uses the fact that for the cdf F of any random variable X, the random variable Y = F(X) has a standard uniform distribution [1]. Since every cdf increases monotonically, it can be inverted [2]. So apply the inverse cdf to both sides of the previous equation and you get F^-1(Y), which is distributed like X.

Sampling from a standard uniform distribution and then applying the inverse transform is the most common way of generating random numbers from an arbitrary distribution.

0. https://en.m.wikipedia.org/wiki/Inverse_transform_sampling

1. https://en.m.wikipedia.org/wiki/Probability_integral_transfo...

2. Not every cdf is one-to-one, however, so you may need a generalized inverse.


For the particular case of the exponential distribution we can go further. By taking advantage of the theory of Poisson processes, we can take samples using a parallel algorithm. It even has a surprisingly succinct SQL translation:

    SELECT *
    FROM Population
    WHERE weight > 0
    ORDER BY -LN(1.0 - RANDOM()) / weight
    LIMIT 100  -- Sample size.
Notice our exponentially distributed random variable on prominent display in the ORDER BY clause.
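A rough, unoptimized Python analogue of that query (the population, weights, and sample size here are made up): each row gets an exponential "arrival time" with rate equal to its weight, and the earliest k arrivals form the sample.

```python
import math
import random

def weighted_sample(population, k, rng=random.random):
    """Weighted sample without replacement: sort by an exponential
    'arrival time' with rate = weight, keep the first k arrivals.
    Mirrors the ORDER BY in the SQL query above."""
    keyed = [
        (-math.log(1.0 - rng()) / w, item)
        for item, w in population
        if w > 0  # zero-weight rows can never be sampled
    ]
    keyed.sort()
    return [item for _, item in keyed[:k]]

random.seed(2)
population = [("a", 10.0), ("b", 1.0), ("c", 1.0), ("d", 0.0)]
sample = weighted_sample(population, 2)
```

Heavier-weighted items tend to get smaller keys, so they are more likely to appear in the sample; "d" never can.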

If you're curious, I explore this algorithm and the theory behind it in https://blog.moertel.com/posts/2024-08-23-sampling-with-sql....


Quite off-topic, but do you know when you'll write the article about CPS, if ever?


Oops. I had quite forgotten that I need to write about that. I said I would over a decade ago, so that's a long time for you to wait. Sorry about that.

I mainly write for myself, so I need the time and the motivation. Until recently, my job at G took up my time and also provided an internal community where I could scratch the writing itch, which reduced the motivation for public writing on my blog. But now that I'm semi-retired, I'll try to write more frequently.

Thanks for the accountability!


Inverse transform sampling is a special case of normalizing flow where we don't need to learn anything.

https://en.wikipedia.org/wiki/Flow-based_generative_model


How long do I have to study math to understand this?


Most textbooks on probability have some discussion of the relationships between the various distributions commonly in use. If you just want a quick overview, I found John D. Cook’s diagram to be handy:

https://www.johndcook.com/blog/distribution_chart/


Possibly unhelpful answer: Arguably none! This is presented as a surprising fact, but you could easily argue that this is the proper definition of an exponential distribution. If you do

x = -log(rand())/lambda

a bunch of times, that comes from something, right? Well, let's call that something an exponential distribution.

From this perspective, the thing that actually needs math is finding the density function of the exponential distribution. (For that, in theory you just need calc 101 and probability 101.)
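Taking that definition at face value, here is a quick empirical check (the rate, threshold, and sample count are arbitrary): the fraction of draws below a threshold should match the exponential cdf 1 - exp(-lambda * a).

```python
import math
import random

random.seed(3)
lam = 1.5
n = 200_000
# "Do x = -log(rand())/lambda a bunch of times" (1 - u avoids log(0)).
draws = [-math.log(1.0 - random.random()) / lam for _ in range(n)]

a = 1.0
empirical = sum(d < a for d in draws) / n
theoretical = 1.0 - math.exp(-lam * a)  # exponential cdf at a
```

The two numbers agreeing to a couple of decimal places is the "that comes from something" made concrete.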


I love that you asked this, but I think it's not quite the right question.

I've been wishing for years that someone would maintain a "dependency graph" for mathematical concepts. I think Khan Academy tried to do something like this at one point but took it down a long time ago. As long as we're far enough from the bleeding-edge of research topics, I feel like maths is the one field where this might be possible to do really well. You should be able to point at something you don't understand and trace backwards in a very granular fashion until you hit a concept you already know/understand, and then learn what you need to fill in the gap!


I had wanted to build a website like this as a hobby, except not limited strictly to mathematical concepts. I wanted to build a graph of the skills/concepts you’d need to understand electrical impedance, fix an air conditioning system, bake bread reliably.

One source of information could be course syllabi, which describe a progression of topics and their prerequisites.

Rather than be the authority on what concepts must precede others, I envisioned making a system that could represent different educational approaches: not every educator agrees that Calculus I ought to be a prerequisite for studying Physics I. Not all bread recipes use yeast.

I had a hard time finding a good domain for this effort. “Tree of knowledge” dot TLD was taken.


That's more or less the purpose of https://mathworld.wolfram.com/, or at least it's significantly better at it than, say, Wikipedia.


Unless I'm missing something, this can just be directly verified, no “understanding” necessary. All you need to know is that a probability distribution is characterized by its cumulative distribution function (CDF).

If Y = -ln(X)/lambda, then P(Y < a) = P(-ln(X)/lambda < a) = P(X > exp(-lambda a)) = 1 - exp(-lambda a).

And if Z is exponential with rate parameter lambda, then P(Z < a) is the integral from 0 to a of lambda exp(-lambda t) dt, which can be directly computed to be 1 - exp(-lambda a).

They have the same CDF, so they're the same distribution.


I mean, if starting from scratch, that seems like many years in most western education systems to get to probability, logarithms, and exponentiation.

I would say if you knew 2+2=4 and not much else, you're years away from 'understanding'; if you know ln(exp(y)) = y and P(X > 0.5) = 0.5 for a uniform distribution on [0, 1), then you don't need any additional understanding.

I would bet the GP comment is somewhere in between the two extremes, but I think a random sampling of the population would likely result in people generally not knowing the log/exponentiation relation, or anything about the uniform distribution.


Yeah, I got many answers and I don't understand a single one. Good thing you barely need math in programming.


Not long. We were taught logs, and the use of log tables, at Middle School. So probably around age 11.

I also vaguely recall a couple of lessons where we went over Napier's Bones, and they had us produce equivalents on pieces of paper to cut out and move around.

I believe I still have my school-day log tables around somewhere. I'd just have to practice for half an hour to remind myself how to use them. That said, they did have usage instructions in the last few pages.


Look, I'm 30; most people I know have forgotten all of school math long ago, me included. The entry barrier is too big now.


Having scanned through his book, I'd say it makes this seem overly complex.

At middle school, this was taught after having only done simple arithmetic and learning about fractions (rational numbers) in primary school, then decimal fractions in middle school.

The use of logs from the tables was simply a set of basic rules for how to apply them to a few scenarios. I can't recall if it covered trig with those tables, but I doubt it.

I learnt this as a child between 10 (when I started middle school) and, say, around 12 at most. I've forgotten the use, but vaguely recall some way of representing negative numbers in the table (n-bar, with a bar above the digit).

I'm way over 30. I never used log tables after high school, and have forgotten the rules for usage, but recall it didn't take long to learn the first time. *

However, for simple uses (multiplication and division) I'd expect I'd be able to pick it up again in at most a week's worth of practice. It would be made a lot easier now by being able to compare and check calculations with a computer or pocket calculator.

I'd expect any adult able to program a computer to also be able to pick it up in a similar period, or at most a month.

Remember we used to teach this to kids, and expect them to be able to pick it up (if not be accurate in application) in under a week's worth of lessons.

* Note I didn't even know how to do long multiplication when I learnt, as, due to political interference with the teaching curriculum, I'd not been taught it at primary school.


Khan academy could get you through it, in I'd guess a month or three, depending on your time available.


Depends where you're starting from but from highschoolish maths you can probably sort this out in a few hours or days.


All you need is to plot -log(x) for x between 0 and 1, and you will see that -log(x) transforms a uniform input into an exponential decay (falling toward zero as x goes to 1). It's the same fact being said in a fancy way. This is also the origin of the log-likelihood loss function in ML.
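A tiny numeric version of that plot (the sample points are chosen arbitrarily): -log(x) falls steeply from large values near 0 toward 0 as x approaches 1.

```python
import math

xs = [0.01, 0.1, 0.5, 0.9, 0.99]
ys = [-math.log(x) for x in xs]  # strictly decreasing in x
```

Evaluating at a few points already shows the decay shape without any plotting library.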


It seems like there's always more than one way to skin a cat, but I'd have turned to calculus, had I needed to derive something like this and didn't think to look it up (e.g., before the Internet).


If you study calculus and introduction to probability theory, then you're ready to learn this. So the answer is about 2 years after high school.


To understand what they said, or to understand a proof of why it would be true?

Any stats class would be enough to understand what they said.


I know what all these words mean; it "makes sense" to me in the sense that I read it and think "ok..", but I wouldn't have the slightest idea how to use this to get weighted random samples or "generate event times."

So I guess I "understand it" in the sense that it doesn't sound like a foreign language, but I can't apply it in any meaningful way.
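For a concrete sense of the "generate event times" use mentioned at the top of the thread, one common pattern (the rate and time horizon below are made up) is to sum exponential inter-arrival gaps to simulate a Poisson process:

```python
import math
import random

def poisson_event_times(rate, t_end, rng=random.random):
    """Event times of a Poisson process with the given rate on
    [0, t_end]: keep adding exponential inter-arrival gaps."""
    times = []
    t = 0.0
    while True:
        t += -math.log(1.0 - rng()) / rate
        if t > t_end:
            return times
        times.append(t)

random.seed(4)
events = poisson_event_times(rate=3.0, t_end=1000.0)
avg_rate = len(events) / 1000.0  # should be near 3.0
```

Each gap is one application of the -ln(U)/λ trick, and the running total gives the simulated event times.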


Depends on how much you practiced high school math. It's not hard but we forget it without practice.


Ask chatgpt to explain it to you like you're 18. It made it really easy to understand.


One way to understand why without writing down the CDF/PDF:

When X is an exponential variable and c is a constant, X + c has the same distribution as X after conditioning on large outcomes. In other words, these two variables have the same "tail." This is true exactly for exponential distributions. (Sometimes this is called "memorylessness.")

Similarly, when U has a uniform distribution on [0, 1] and c is a constant, cU has the same distribution as U after conditioning on small outcomes.

But if cU is distributed like U near 0, then -ln(cU) is distributed like -ln(U) near infinity. And -ln(cU) = -ln(c) - ln(U), so the tail of -ln(U) doesn't change when we add a constant, meaning it must have an exponential distribution.
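That memorylessness claim is easy to spot-check numerically (the rate and thresholds here are arbitrary): conditioned on exceeding s, an exponential variable exceeds s + t with the same probability that it exceeds t unconditionally.

```python
import math
import random

random.seed(5)
lam, s, t = 1.0, 0.7, 1.2
n = 500_000
draws = [-math.log(1.0 - random.random()) / lam for _ in range(n)]

exceed_s = [x for x in draws if x > s]
cond = sum(x > s + t for x in exceed_s) / len(exceed_s)
uncond = sum(x > t for x in draws) / n  # both should be near exp(-lam * t)
```

The two estimates agreeing (and both sitting near exp(-λt)) is the memoryless tail in action.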



