Pandas Read_csv: Decimal And Delimiter Is The Same Character
Recently I'm struggling to read an csv file with pandas pd.read_csv. The problem is, that in the csv file a comma is used both as decimal point and as separator for columns. The csv looks as follows:
wavelength,intensity
390,0,382
390,1,390
390,2,400
390,3,408
390,4,418
390,5,427
390,6,437
390,7,447
390,8,457
390,9,468
Pandas accordingly always splits the data into three separate columns. However the first comma is only the decimal point. I want to plot it with the wavelength (x-axis) with 390.0, 390.1, 390.2 nm and so on.
I must somehow tell pandas, that the first comma in line is the decimal point, and the second one is the separator. How do I do this?
Best
Answer
I'm not sure that this is possible. It almost is, as you can see by the following example:
>>> pd.read_csv('test.csv', engine='python', sep=r',(?!\d+$)')
wavelength intensity
0 390 0,382
1 390 1,390
2 390 2,400
3 390 3,408
4 390 4,418
5 390 5,427
6 390 6,437
7 390 7,447
8 390 8,457
9 390 9,468
...but the wrong comma is being split. I'll keep trying to see if it's possible ;)
Meanwhile, a simple solution would be to take advantage of the fact that that pandas puts part of the first column in the index:
df = (pd.read_csv('test.csv')
.reset_index()
.assign(wavelength=lambda x: x['index'].astype(str) + '.' + x['wavelength'].astype(str))
.drop('index', axis=1)
.astype({'wavelength': float}))
Output:
>>> df
wavelength intensity
0 390.0 382
1 390.1 390
2 390.2 400
3 390.3 408
4 390.4 418
5 390.5 427
6 390.6 437
7 390.7 447
8 390.8 457
9 390.9 468
EDIT: It is possible!
The following regular expression with a little dropna column-wise gets it done:
df = pd.read_csv('test.csv', engine='python', sep=r',(!?\w+)$').dropna(axis=1, how='all')
Output:
>>> df
wavelength intensity
0 390,0 382
1 390,1 390
2 390,2 400
3 390,3 408
4 390,4 418
5 390,5 427
6 390,6 437
7 390,7 447
8 390,8 457
9 390,9 468
Related Questions
- → What are the pluses/minuses of different ways to configure GPIOs on the Beaglebone Black?
- → Django, code inside <script> tag doesn't work in a template
- → React - Django webpack config with dynamic 'output'
- → GAE Python app - Does URL matter for SEO?
- → Put a Rendered Django Template in Json along with some other items
- → session disappears when request is sent from fetch
- → Python Shopify API output formatted datetime string in django template
- → Can't turn off Javascript using Selenium
- → WebDriver click() vs JavaScript click()
- → Shopify app: adding a new shipping address via webhook
- → Shopify + Python library: how to create new shipping address
- → shopify python api: how do add new assets to published theme?
- → Access 'HTTP_X_SHOPIFY_SHOP_API_CALL_LIMIT' with Python Shopify Module