Faster Way To Threshold A 4-D Numpy Array
I have a 4D numpy array of shape (98,359,256,269) that I want to threshold. Right now, I have two separate lists that hold the coordinates for the first 2 dimensions and the last 2 dimensions (mag_ang for the first 2 dimensions and indices for the last 2).
size of indices : (61821,2)
size of mag_ang : (35182,2)
Currently, my code looks like this:

    inner_points = []
    for k in indices:
        x = k[0]
        y = k[1]
        for i, ctr in enumerate(mag_ang):
            mag = ctr[0]
            ang = ctr[1]
            if X[mag][ang][x][y] > 10:
                inner_points.append((y, x))

This code works, but it's pretty slow. Is there a more pythonic/faster way to do this?
(EDIT: added a second alternate method)
Use NumPy integer array indexing (several index arrays at once):

    import time
    import numpy as np

    # small demo sizes; the real sizes are given in the trailing comments
    n_mag, n_ang, n_x, n_y = 10, 12, 5, 6
    shape = n_mag, n_ang, n_x, n_y
    X = np.random.random_sample(shape) * 20

    nb_indices = 100  # 61821
    indices = np.c_[np.random.randint(0, n_x, nb_indices), np.random.randint(0, n_y, nb_indices)]
    nb_mag_ang = 50  # 35182
    mag_ang = np.c_[np.random.randint(0, n_mag, nb_mag_ang), np.random.randint(0, n_ang, nb_mag_ang)]

    # original method
    inner_points = []
    start = time.time()
    for x, y in indices:
        for mag, ang in mag_ang:
            if X[mag][ang][x][y] > 10:
                inner_points.append((y, x))
    end = time.time()
    print(end - start)

    # faster method 1: vectorize over mag_ang, still loop over indices
    inner_points_faster1 = []
    start = time.time()
    for x, y in indices:
        if np.any(X[mag_ang[:, 0], mag_ang[:, 1], x, y] > 10):
            inner_points_faster1.append((y, x))
    end = time.time()
    print(end - start)

    # faster method 2: vectorize over both mag_ang and indices
    start = time.time()
    # note: depending on the real size of mag_ang and indices, you may wish to do this the other way round ?
    found = X[:, :, indices[:, 0], indices[:, 1]][mag_ang[:, 0], mag_ang[:, 1], :] > 10
    # 'found' shape is (nb_mag_ang x nb_indices)
    assert found.shape == (nb_mag_ang, nb_indices)
    matching_indices_mask = found.any(axis=0)
    # reverse the columns so each row is (y, x), matching the loops above
    inner_points_faster2 = indices[matching_indices_mask, ::-1]
    end = time.time()
    print(end - start)

    # finally assert equality of findings (axis=0 compares whole rows, not single values)
    inner_points = np.unique(np.array(inner_points), axis=0)
    inner_points_faster1 = np.unique(np.array(inner_points_faster1), axis=0)
    inner_points_faster2 = np.unique(inner_points_faster2, axis=0)
    assert np.array_equal(inner_points, inner_points_faster1)
    assert np.array_equal(inner_points, inner_points_faster2)

    0.04685807228088379
    0.0
    0.0

(Of course, the second and third timings are not really zero: both runs finished within the resolution of time.time(). If you increase the shape, they will show up as non-zero.)
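To get a usable number at the small demo sizes, one option is to repeat the call many times with the standard-library timeit module. The following is only a sketch: the function name faster_method_2 and the repeat counts are illustrative choices, not part of the code above.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
n_mag, n_ang, n_x, n_y = 10, 12, 5, 6
X = rng.random((n_mag, n_ang, n_x, n_y)) * 20
indices = np.c_[rng.integers(0, n_x, 100), rng.integers(0, n_y, 100)]
mag_ang = np.c_[rng.integers(0, n_mag, 50), rng.integers(0, n_ang, 50)]

def faster_method_2():
    # Same double fancy-indexing expression as in "faster method 2"
    found = X[:, :, indices[:, 0], indices[:, 1]][mag_ang[:, 0], mag_ang[:, 1], :] > 10
    return indices[found.any(axis=0)]

# Call the function n_calls times per run, do 5 runs, keep the fastest run,
# and divide to get an average time per call
n_calls = 1000
best = min(timeit.repeat(faster_method_2, number=n_calls, repeat=5)) / n_calls
print(f"faster method 2: {best:.2e} s per call")
```

The same wrapper can be applied to the other two methods to compare them on equal footing.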
Final note: here I call np.unique only at the end, but it may be wise to deduplicate the mag_ang array up front (unless you are sure its rows are already unique), so that no (mag, ang) pair is tested twice.
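A minimal sketch of that up-front deduplication, combined with one possible way to keep memory bounded at the real sizes: reduce the selected (mag, ang) slices into a 2-D boolean mask over (x, y) in chunks, then look the indices up in that mask. The names mag_ang_u, indices_u, hit and the chunk size are assumptions made for illustration, and the demo data is random.

```python
import numpy as np

rng = np.random.default_rng(0)
n_mag, n_ang, n_x, n_y = 10, 12, 5, 6
X = rng.random((n_mag, n_ang, n_x, n_y)) * 20
indices = np.c_[rng.integers(0, n_x, 100), rng.integers(0, n_y, 100)]
mag_ang = np.c_[rng.integers(0, n_mag, 50), rng.integers(0, n_ang, 50)]

# Drop repeated rows; axis=0 makes np.unique compare whole pairs
mag_ang_u = np.unique(mag_ang, axis=0)
indices_u = np.unique(indices, axis=0)

# hit[x, y] becomes True if any selected (mag, ang) slice exceeds 10 there
hit = np.zeros((n_x, n_y), dtype=bool)
chunk = 16  # hypothetical chunk size; tune it to the available memory
for start in range(0, len(mag_ang_u), chunk):
    part = mag_ang_u[start:start + chunk]
    # X[part[:, 0], part[:, 1]] has shape (len(part), n_x, n_y)
    hit |= (X[part[:, 0], part[:, 1]] > 10).any(axis=0)

# Keep the (x, y) pairs that hit, stored as (y, x) like the original loop
matched = indices_u[hit[indices_u[:, 0], indices_u[:, 1]]]
inner_points = matched[:, ::-1]
print(inner_points.shape)
```

Only one chunk of slices is copied at a time, so peak memory is governed by the chunk size rather than by the number of (mag, ang) pairs.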