File Modification Times Not Equal After Calling shutil.copystat(file1, file2) Under Windows

I run the following code with Python 2.7.5 under Windows:

import os, shutil, stat, time

with open('test.txt', 'w') as f: pass # create an arbitrary file
shutil.copy('test.txt', 'test2.txt') # copy it
shutil.copystat('test.txt', 'test2.txt') # copy its stats, too

t1 = os.lstat('test.txt').st_mtime # get the time of last modification for both files
t2 = os.lstat('test2.txt').st_mtime

print t1 # prints something like: 1371123658.54
print t2 # prints the same string, as expected: 1371123658.54
print t1 == t2 # prints False! Why?!

I expect both timestamps (floats) to be equal, as their string representations suggest, so why does t1 == t2 evaluate to False?

Also, I was unable to reproduce this behaviour with less code, i.e. without comparing the timestamps retrieved via os.lstat from two different files. I have the feeling I am missing something trivial here...


Edit: After further testing I noticed that it does print True once in a while, but no more often than about once every 10 runs.


Edit 2: As suggested by larsmans:

print ("%.7f" % t1) # prints e.g. 1371126279.1365688
print ("%.7f" % t2) # prints e.g. 1371126279.1365681

This raises two new questions:

  1. Why are the timestamps not equal after calling shutil.copystat?
  2. Does print round floats by default?!

Answer

The problem is the conversion between different formats during the copystat call. Windows stores file times as a 64-bit integer count of 100-nanosecond intervals (in effect a fixed-point format), while Python represents them as binary floating-point numbers of seconds. Each conversion between the two formats can lose some accuracy. During the copystat call:

  1. A call to os.stat converts the Windows format to Python's floating-point format. Some accuracy is lost.
  2. os.utime is called to update the file time. This converts the value back to the Windows format. Some accuracy is lost again, so the resulting file time is not necessarily the same as the first file's.

When you call os.lstat yourself, a third inaccurate conversion is performed. Due to these conversions, the file times are not exactly the same.
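
The loss can be illustrated without touching the file system. The following is a minimal sketch, not the code Python actually runs: the helper names and the conversion formulas are assumptions chosen for illustration, based on the Windows FILETIME layout (100-nanosecond ticks counted from 1601-01-01). It shows that several adjacent tick values collapse onto the same float, so a round trip cannot always restore the original tick.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# the conversions CPython itself performs.

# Number of 100 ns ticks between 1601-01-01 (FILETIME epoch) and
# 1970-01-01 (Unix epoch).
EPOCH_DIFF_TICKS = 116444736000000000
TICKS_PER_SECOND = 10 ** 7


def ticks_to_seconds(ticks):
    """Convert a FILETIME-style tick count to float Unix seconds."""
    return (ticks - EPOCH_DIFF_TICKS) / float(TICKS_PER_SECOND)


def seconds_to_ticks(seconds):
    """Convert float Unix seconds back to a FILETIME-style tick count."""
    return int(round(seconds * TICKS_PER_SECOND)) + EPOCH_DIFF_TICKS


# Ten consecutive tick values around the timestamp from the question.
start = EPOCH_DIFF_TICKS + 13711262791365688
ticks = [start + i for i in range(10)]

as_floats = [ticks_to_seconds(t) for t in ticks]
round_tripped = [seconds_to_ticks(s) for s in as_floats]

# A double near 1.37e9 can only resolve steps of about 2.4e-7 seconds,
# which is coarser than one 100 ns tick, so some ticks share a float.
distinct_floats = len(set(as_floats))
changed = sum(1 for a, b in zip(ticks, round_tripped) if a != b)

print(distinct_floats)  # fewer than 10
print(changed)          # at least one tick did not survive the round trip
```

The real conversions inside os.stat and os.utime may round or truncate differently, but the underlying limitation is the same: a float holding seconds since 1970 has less resolution than the tick counter it was derived from.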

The documentation for os.utime mentions this:

Note that the exact times you set here may not be returned by a subsequent stat() call, depending on the resolution with which your operating system records access and modification times
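
Given that note, comparing modification times for exact equality is fragile. One possible workaround, sketched here under the assumption that a difference below one millisecond is irrelevant for your use case, is to compare with a tolerance. The function name mtimes_match and the default tolerance are hypothetical choices, not part of any library.

```python
import os


def mtimes_match(path_a, path_b, tolerance=1e-3):
    """Return True if both files' modification times differ by at most
    `tolerance` seconds.

    Hypothetical helper; the 1 ms default is an assumption and should be
    adapted to the resolution of the file system in use.
    """
    t_a = os.lstat(path_a).st_mtime
    t_b = os.lstat(path_b).st_mtime
    return abs(t_a - t_b) <= tolerance
```

With the files from the question, mtimes_match('test.txt', 'test2.txt') would be expected to return True even when t1 == t2 is False.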


Regarding your second question (why print appears to show the same values for both): Converting a floating-point value to a string with str(f) or print f rounds the value (to 12 significant digits in Python 2). To get a string that differs whenever the floating-point values differ, use print repr(f) instead.
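
As a small illustration using the two values from the question: the sketch below formats with "%.12g", which mimics the 12 significant digits that Python 2's str() uses for floats, so it behaves the same on newer Python versions where str() no longer rounds this way. The variable names short1 and short2 are made up for the example.

```python
t1 = 1371126279.1365688
t2 = 1371126279.1365681

# Python 2's str() formats floats with 12 significant digits, which is
# what "%.12g" does explicitly (and portably across Python versions).
short1 = "%.12g" % t1
short2 = "%.12g" % t2

print(short1)                # 1371126279.14
print(short1 == short2)      # True: the rounded strings hide the difference
print(repr(t1) == repr(t2))  # False: repr keeps the two floats apart
print(t1 == t2)              # False
```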

source: stackoverflow.com