Convert Text File To Tiff File In Python

I am converting a text file to TIFF with the following code, but it fails when the text file's content starts with special characters, and I don't know why. Could anyone please help me with this?

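import PIL.Image
import PIL.ImageDraw
import PIL.ImageFont
import PIL.ImageOps

# Assumed values (not shown in the post): black text on a white background
PIXEL_ON = 0
PIXEL_OFF = 255

def main():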
    image = text_image('/Users/administrator/Desktop/367062657_1.text')
    image.show()
    image.save('contentok.tiff')

def text_image(text_path, font_path=None):

    grayscale = 'L'
    # parse the file into lines
    with open(text_path) as text_file:
        lines = tuple(l.rstrip() for l in text_file.readlines())

    large_font = 20
    font_path = font_path or 'cour.ttf'  
    try:
        font = PIL.ImageFont.truetype(font_path, size=large_font)
    except IOError:
        font = PIL.ImageFont.load_default()
        print('Could not use chosen font. Using default.')
    pt2px = lambda pt: int(round(pt * 96.0 / 72))
    max_width_line = max(lines, key=lambda s: font.getsize(s)[0])
    test_string = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    max_height = pt2px(font.getsize(test_string)[1])
    max_width = pt2px(font.getsize(max_width_line)[0])
    height = max_height * len(lines) # perfect or a little oversized
    width = int(round(max_width + 5))  # a little oversized

    image = PIL.Image.new(grayscale, (width, height), color=PIXEL_OFF)
    draw = PIL.ImageDraw.Draw(image)
    vertical_position = 5
    horizontal_position = 5
    line_spacing = int(round(max_height * 1.0))
    for line in lines:
        draw.text((horizontal_position, vertical_position),
                  line, fill=PIXEL_ON, font=font)

        vertical_position += line_spacing
    c_box = PIL.ImageOps.invert(image).getbbox()
    image = image.crop(c_box)
    return image

Error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf2 in position 18: invalid continuation byte

Answer

Your problem is here:

with open(text_path) as text_file:
    lines = tuple(l.rstrip() for l in text_file.readlines())

You're opening the file in text mode, so Python decodes it with the default encoding (UTF-8 on your system), but the data in it is not valid UTF-8, as the error you posted shows.
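To see why this happens, here is a minimal sketch that reproduces the same kind of failure without any file. The byte string `raw` and the helper `try_decode` are made up for illustration. In UTF-8, 0xf2 announces a multi-byte sequence, so a following plain ASCII byte is rejected, while Latin-1 maps every byte to a character and therefore cannot fail.

```python
# Hypothetical sample data: 0xf2 followed by an ordinary ASCII byte
raw = b"caf\xf2 latte"


def try_decode(data, encoding):
    """Return the decoded text, or a short description of the decode error."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"{type(exc).__name__}: {exc.reason}"


utf8_result = try_decode(raw, "utf-8")      # fails: 0xf2 is not followed by a continuation byte
latin1_result = try_decode(raw, "latin-1")  # succeeds: every byte is a valid Latin-1 character

print(utf8_result)
print(latin1_result)
```

Which encoding gives the right characters still depends on how the file was written; decoding without an error does not prove the encoding is the correct one.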

You should open the file with an explicit encoding that matches the data. See the documentation for the built-in open() function.

Basically something like this should work:

with open(text_path, encoding='windows-1255') as text_file:
    lines = tuple(l.rstrip() for l in text_file.readlines())

Of course, windows-1255 is just a guess on my part. You need to know how your files are actually encoded; the available values are listed in the "Standard Encodings" section of the codecs module documentation.
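If you really cannot find out the encoding, one possible approach is to try a short list of candidates in order. This is only a sketch: the function names `decode_with_fallback` and `read_lines` and the candidate list are assumptions for illustration, not something from your code. Latin-1 is placed last because it accepts any byte sequence, so it always "works" even when the resulting characters are wrong.

```python
from pathlib import Path

# Assumed candidate list; replace it with the encodings your files may really use
CANDIDATES = ("utf-8", "windows-1255", "latin-1")


def decode_with_fallback(raw, encodings=CANDIDATES):
    """Decode bytes with the first encoding that raises no error.

    Returns a (text, encoding_used) pair.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # this candidate does not fit, try the next one
    raise ValueError("none of the candidate encodings could decode the data")


def read_lines(text_path, encodings=CANDIDATES):
    """Read a text file and return its lines with trailing whitespace removed."""
    text, _ = decode_with_fallback(Path(text_path).read_bytes(), encodings)
    return tuple(line.rstrip() for line in text.splitlines())
```

In `text_image` you could then call `lines = read_lines(text_path)` instead of the `with open(...)` block. If losing a few characters is acceptable, `open(text_path, encoding='utf-8', errors='replace')` is a simpler alternative that substitutes a replacement character for undecodable bytes.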

source: stackoverflow.com