Pandas.read_table - Preceding Zeros Of Numbers Are Removed
When I read a file into a dataframe and one of its columns contains integers with leading zeros, the zeros are removed. How can I prevent this?
Example:
The file "test.txt" has the following content:
one two three
a 025700 's'
b 005930 7
cc 125945 hi
ddd 000003 9.0
Now I am reading it into a dataframe:
import pandas as pd
filename = "test.txt"
df = pd.read_table(filename, sep=" ")
The output is:
print(df)
one two three
0 a 25700 's'
1 b 5930 7
2 cc 125945 hi
3 ddd 3 9.0
I would like the second column of the dataframe to keep exactly the same content as in the file:
one two three
0 a 025700 's'
1 b 005930 7
2 cc 125945 hi
3 ddd 000003 9.0
Answer
Use the dtype parameter:
df = pd.read_table(filename, sep=" ", dtype={'two': str})
print(df)
# Output
one two three
0 a 025700 's'
1 b 005930 7
2 cc 125945 hi
3 ddd 000003 9.0
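To try this without creating test.txt, here is a minimal sketch that holds the same content in an in-memory buffer. The use of io.StringIO and the variable names data, inferred and as_text are assumptions for illustration only; they are not part of the question.

```python
import io
import pandas as pd

# Same content as test.txt in the question, kept in memory for the demo.
data = (
    "one two three\n"
    "a 025700 's'\n"
    "b 005930 7\n"
    "cc 125945 hi\n"
    "ddd 000003 9.0\n"
)

# Default type inference parses column "two" as integers,
# so the leading zeros are lost.
inferred = pd.read_table(io.StringIO(data), sep=" ")

# Declaring the column as str keeps the text exactly as written.
as_text = pd.read_table(io.StringIO(data), sep=" ", dtype={"two": str})

print(inferred["two"].tolist())  # [25700, 5930, 125945, 3]
print(as_text["two"].tolist())   # ['025700', '005930', '125945', '000003']
```

Only the column named in the dtype mapping is affected; the other columns are still inferred as usual.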
Or, if you don't want Pandas to infer data types for any column:
df = pd.read_table(filename, sep=" ", dtype=object)
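As a rough check of what dtype=object does, the sketch below reads the same sample content from an in-memory buffer and confirms that every value comes back as a string. The buffer and the names data and raw are assumptions made for this example.

```python
import io
import pandas as pd

# Same content as test.txt in the question, kept in memory for the demo.
data = (
    "one two three\n"
    "a 025700 's'\n"
    "b 005930 7\n"
    "cc 125945 hi\n"
    "ddd 000003 9.0\n"
)

# dtype=object applies to all columns: nothing is converted to numbers.
raw = pd.read_table(io.StringIO(data), sep=" ", dtype=object)

print(raw["two"].tolist())    # ['025700', '005930', '125945', '000003']
print(raw["three"].tolist())  # ["'s'", '7', 'hi', '9.0']

# Every cell is a plain Python string.
print(all(isinstance(v, str) for col in raw.columns for v in raw[col]))  # True
```

Because nothing is converted, any column that should be numeric has to be converted explicitly afterwards.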
source: stackoverflow.com