Fastest Way To Convert A Binary String To Binary Array (Array Of 1 And 0)

I am trying to find the fastest possible way to convert a binary string to an array of the integers 0 and 1. I am currently using Python 3.8 and have the following two functions to obtain such an array:

import numpy as np
from typing import Literal, Sequence


def string_to_array(Bin_String):
    Bin_array = [int(Bin_String[i], 2) for i in range(len(Bin_String))]
    return Bin_array

def string_to_array_LtSq(string: Sequence[Literal['0', '1']]) -> np.ndarray:
    return np.array([int(c) for c in string])

For a string of length 1024, the string_to_array_LtSq function takes about 20 microseconds less than the other function (average 370 microseconds), though I don't understand why it is faster, since both use the int function.
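
One way to look into where the difference comes from is to time the variations separately. The sketch below is only an experiment under assumptions: the helper names are hypothetical, the input is an arbitrary 1024-character test string, and the np.array conversion is left out so that only the loop style and the int call differ. The index lookup and the explicit base argument may each add a little overhead, but the numbers will vary by machine and Python version.

```python
from timeit import timeit

s = "1011" * 256  # arbitrary 1024-character test input (assumption)


def by_index_base2():
    # Index-based loop with an explicit base, as in string_to_array
    return [int(s[i], 2) for i in range(len(s))]


def by_iteration_base2():
    # Direct iteration over the characters, still passing the base
    return [int(c, 2) for c in s]


def by_iteration():
    # Direct iteration without a base, as in string_to_array_LtSq
    # (the np.array conversion is deliberately omitted here)
    return [int(c) for c in s]


runs = 200
for f in by_index_base2, by_iteration_base2, by_iteration:
    total = timeit(f, number=runs)  # total seconds for all runs
    print(f.__name__, '%.1f μs per call' % (total / runs * 1e6))
```

All three helpers return the same list, so any timing gap comes from how the characters are reached and how int is called, not from the result.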

This is an important part of my code, so is there a faster way in Python?

Also, is it possible to do this faster in another language (for example C)? I might switch to that language.

Thanks.


Answer

bytearray appears to be even faster than Andrej's NumPy solution, and bytes can be used for a fast list solution. Times with 1024 bits (only the first 5 elements of each result are shown):

f1   2.7 μs  [1 0 1 1 1]
f2   2.0 μs  bytearray(b'\x01\x00\x01\x01\x01')
f3   7.6 μs  [1, 0, 1, 1, 1]

Code based on Andrej's:

import numpy as np
from timeit import timeit

s = "1011" * 256  # length = 1024


def f1():
    return np.frombuffer(s.encode("ascii"), dtype="u1") - 48


table = bytearray.maketrans(b'01', b'\x00\x01')

def f2():
    return bytearray(s, "ascii").translate(table)


def f3():
    return [*s.encode().translate(table)]


for _ in range(3):
    for f in f1, f2, f3:
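        # timeit returns the total seconds for 1000 calls,
        # so multiplying by 1e3 gives microseconds per call
        t = timeit(f, number=1_000)
        t = '%5.1f μs ' % (t * 1e3)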
        print(f.__name__, t, f()[:5])
    print()
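
As a sanity check, the three approaches can be compared for equality. The following is a minimal sketch, assuming the same test string as above; the variable names are made up for illustration. It also shows that, if a NumPy array is wanted in the end, np.frombuffer can wrap the bytearray result without copying it.

```python
import numpy as np

s = "1011" * 256  # same 1024-character test string

table = bytearray.maketrans(b'01', b'\x00\x01')

as_array = np.frombuffer(s.encode("ascii"), dtype="u1") - 48  # like f1
as_bytearray = bytearray(s, "ascii").translate(table)         # like f2
as_list = [*s.encode().translate(table)]                      # like f3

# All three hold the same sequence of 0/1 integers
assert as_array.tolist() == list(as_bytearray) == as_list

# Wrap the bytearray as a uint8 array that shares its memory (no copy)
view = np.frombuffer(as_bytearray, dtype=np.uint8)
print(view[:5])  # [1 0 1 1 1]
```

Because the view shares memory with the bytearray, writing to one changes the other; copy it first if independent data is needed.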
source: stackoverflow.com