Faster Alternatives To Using Numpy.random.choice In Python?

- 1 answer

My goal is to generate a large 2D array in Python where each number is either a 0 or 1. To do this, I created a nested for-loop as shown below:

    import numpy

    for count in range(0,300):
      block = numpy.zeros((8,300000))

      for a in range(0,8):
        for b in range(0,300000):
          # one Python-level call to numpy.random.choice per element
          block[a][b] = numpy.random.choice(2,1, p=[0.9,0.1])

Each element of the block has a 90% chance of being a "0" and a 10% chance of being a "1". But a single pass of the outer for loop takes over 1 minute. Is there a more efficient way to pick random numbers for a large number of arrays while still being able to use the "p" values? (This is my first post, so sorry if the formatting is broken.)

Answer

The idea behind NumPy is to avoid looping through 720,000,000 iterations at the Python level. Use whole-array operations instead, such as having numpy.random.choice generate an entire array of choices in one call:

    block = numpy.random.choice(2, size=(8, 300000), p=[0.9, 0.1])

This completes almost instantly.
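
For reference, here is a minimal sketch of two whole-array ways to build one block. It assumes a NumPy version that provides `numpy.random.default_rng`. The names `rng`, `shape`, `p_one`, `block_choice` and `block_uniform` are made up for illustration and the seed is arbitrary. The second variant is a swapped-in technique, not something from the original post: it thresholds uniform samples, which gives the same 0/1 probabilities for the two-outcome case.

```python
import numpy as np

# Seeded generator so the sketch is reproducible; the seed value is arbitrary.
rng = np.random.default_rng(12345)

shape = (8, 300000)
p_one = 0.1  # probability of drawing a 1, as in the question

# Option 1: a single vectorized call to choice with a size argument.
block_choice = rng.choice(2, size=shape, p=[1 - p_one, p_one])

# Option 2 (assumption: only two outcomes are needed): compare uniform
# samples in [0, 1) against the threshold. Each entry is 1 with
# probability p_one and 0 otherwise.
block_uniform = (rng.random(shape) < p_one).astype(np.int8)

# The fraction of ones in each block should be close to p_one.
print(block_choice.shape, block_choice.mean())
print(block_uniform.shape, block_uniform.mean())
```

Either line can go inside the outer loop in place of the two inner loops. Which one is faster may depend on the NumPy version and the machine, so it is worth timing both on your own setup.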

source: stackoverflow.com