[PYTHON] pandas to_csv: suppress scientific notation in the CSV file when writing a pandas DataFrame to CSV
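I am writing a pandas df to a CSV file. When I write it, some elements in one of the columns are being incorrectly converted to scientific notation / numbers. For example, col_1 contains strings such as '104D59'. Most of these are represented as strings in the CSV file, as they should be. However, occasional strings such as '104E59' are converted to scientific notation (e.g., 1.04 E 61) and represented as numbers in the resulting CSV file.
I am trying to export the CSV file to a software package (i.e., pandas -> csv -> software_new), and this change in data type is causing problems with that export.
Is there a way to write the df to CSV so that every element of df['problem_col'] is represented as a string in the resulting CSV, or at least is not converted to scientific notation?
Here is the code I use to write the pandas df to CSV:

    df.to_csv('df.csv', encoding='utf-8')

I have also checked the dtype of the problem column: according to df.dtypes, df['problem_column'] is object.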
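One way to narrow down where the conversion happens is to inspect the raw CSV text that pandas produces before any other program opens it. The following is a minimal sketch, assuming a hypothetical two-row frame that mirrors the column described above; the frame contents and the in-memory buffer are illustrative assumptions, not the asker's data.

```python
import io

import pandas as pd

# Hypothetical data mirroring the question: codes that merely look like numbers
df = pd.DataFrame({"problem_col": ["104D59", "104E59"]})

# Write to an in-memory buffer so the raw CSV text can be inspected directly
buf = io.StringIO()
df.to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)

# Reading the text back with dtype=str keeps every code as a string
round_trip = pd.read_csv(io.StringIO(csv_text), dtype=str)
print(round_trip["problem_col"].tolist())
```

If the raw text still contains 104E59 unchanged, the conversion is probably happening in whatever reads the file afterwards (a spreadsheet application, for instance) rather than in to_csv itself.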
Solutions
-
==============================
1. Use the float_format argument.
    In [11]: df = pd.DataFrame(np.random.randn(3, 3) * 10 ** 12)

    In [12]: df
    Out[12]:
                  0             1             2
    0  1.757189e+12 -1.083016e+12  5.812695e+11
    1  7.889034e+11  5.984651e+11  2.138096e+11
    2 -8.291878e+11  1.034696e+12  8.640301e+08

    In [13]: print(df.to_string(float_format='{:f}'.format))
                         0                     1                   2
    0 1757188536437.788086 -1083016404775.687134 581269533538.170288
    1  788903446803.216797   598465111695.240601 213809584103.112457
    2 -829187757358.493286  1034695767987.889160    864030095.691202
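to_csv works in a similar way:

    df.to_csv('df.csv', float_format='{:f}'.format, encoding='utf-8')

Passing a callable like this may fail on older pandas versions, where to_csv expects a printf-style format string; float_format='%f' is the equivalent string form.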
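To see the effect of float_format on the written text, here is a small sketch that compares the default output with a fixed-point format string. The column name, the sample values, and the csv_text helper are assumptions made for illustration only.

```python
import io

import pandas as pd

# Hypothetical values large and small enough to be written in scientific notation by default
df = pd.DataFrame({"amount": [1.5e20, 2.5e-7]})


def csv_text(frame, **kwargs):
    """Return the CSV text that to_csv would write, using an in-memory buffer."""
    buf = io.StringIO()
    frame.to_csv(buf, index=False, **kwargs)
    return buf.getvalue()


default_text = csv_text(df)                         # no float_format given
fixed_text = csv_text(df, float_format="%.10f")     # fixed-point, 10 decimal places

print(default_text)
print(fixed_text)
```

The string form of float_format is used here because it does not depend on the pandas version accepting a callable.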
-
==============================
2. Options and settings
Use pandas.set_option to control how the DataFrame is displayed; use float_format to control what is written to the CSV file.
    import numpy as np
    import pandas as pd

    # Display options: these change how floats are shown, not what to_csv writes
    pd.set_option('display.html.table_schema', True)  # publish a Table Schema representation for frontends that support it
    pd.set_option('display.precision', 5)  # show 5 digits after the decimal point

    df = pd.DataFrame(np.random.randn(20, 4) * 10 ** -12)  # create a random DataFrame of very small floats

    df.dtypes  # check the datatype of each column
    [output]:
    0    float64
    1    float64
    2    float64
    3    float64
    dtype: object

    df  # output of the DataFrame
    [output]:
                   0             1             2             3
    0   -2.01082e-12   1.25911e-12   1.05556e-12  -5.68623e-13
    1   -6.87126e-13   1.91950e-12   5.25925e-13   3.72696e-13
    2   -1.48068e-12   6.34885e-14  -1.72694e-12   1.72906e-12
    3   -5.78192e-14   2.08755e-13   6.80525e-13   1.49018e-12
    4   -9.52408e-13   1.61118e-13   2.09459e-13   2.10940e-13
    5   -2.30242e-13  -1.41352e-13   2.32575e-12  -5.08936e-13
    6    1.16233e-12   6.17744e-13   1.63237e-12   1.59142e-12
    7    1.76679e-13  -1.65943e-12   2.18727e-12  -8.45242e-13
    8    7.66469e-13   1.29017e-13  -1.61229e-13  -3.00188e-13
    9    9.61518e-13   9.71320e-13   8.36845e-14  -6.46556e-13
    10  -6.28390e-13  -1.17645e-12  -3.59564e-13   8.68497e-13
    11   3.12497e-13   2.00065e-13  -1.10691e-12  -2.94455e-12
    12  -1.08365e-14   5.36770e-13   1.60003e-12   9.19737e-13
    13  -1.85586e-13   1.27034e-12  -1.04802e-12  -3.08296e-12
    14   1.67438e-12   7.40403e-14   3.28035e-13   5.64615e-14
    15  -5.31804e-13  -6.68421e-13   2.68096e-13   8.37085e-13
    16  -6.25984e-13   1.81094e-13  -2.68336e-13   1.15757e-12
    17   7.38247e-13  -1.76528e-12  -4.72171e-13  -3.04658e-13
    18  -1.06099e-12  -1.31789e-12  -2.93676e-13  -2.40465e-13
    19   1.38537e-12   9.18101e-13   5.96147e-13  -2.41401e-12

    df.to_csv('estc.csv', sep=',', float_format='%.15f')  # write with 15 decimal places

    # contents of estc.csv:
    ,0,1,2,3
    0,-0.000000000002011,0.000000000001259,0.000000000001056,-0.000000000000569
    1,-0.000000000000687,0.000000000001919,0.000000000000526,0.000000000000373
    2,-0.000000000001481,0.000000000000063,-0.000000000001727,0.000000000001729
    3,-0.000000000000058,0.000000000000209,0.000000000000681,0.000000000001490
    4,-0.000000000000952,0.000000000000161,0.000000000000209,0.000000000000211
    5,-0.000000000000230,-0.000000000000141,0.000000000002326,-0.000000000000509
    6,0.000000000001162,0.000000000000618,0.000000000001632,0.000000000001591
    7,0.000000000000177,-0.000000000001659,0.000000000002187,-0.000000000000845
    8,0.000000000000766,0.000000000000129,-0.000000000000161,-0.000000000000300
    9,0.000000000000962,0.000000000000971,0.000000000000084,-0.000000000000647
    10,-0.000000000000628,-0.000000000001176,-0.000000000000360,0.000000000000868
    11,0.000000000000312,0.000000000000200,-0.000000000001107,-0.000000000002945
    12,-0.000000000000011,0.000000000000537,0.000000000001600,0.000000000000920
    13,-0.000000000000186,0.000000000001270,-0.000000000001048,-0.000000000003083
    14,0.000000000001674,0.000000000000074,0.000000000000328,0.000000000000056
    15,-0.000000000000532,-0.000000000000668,0.000000000000268,0.000000000000837
    16,-0.000000000000626,0.000000000000181,-0.000000000000268,0.000000000001158
    17,0.000000000000738,-0.000000000001765,-0.000000000000472,-0.000000000000305
    18,-0.000000000001061,-0.000000000001318,-0.000000000000294,-0.000000000000240
    19,0.000000000001385,0.000000000000918,0.000000000000596,-0.000000000002414

    # '%f' writes fixed-point with 6 decimal places, so values this small come out as 0.000000 or -0.000000
    df.to_csv('estc.csv', sep=',', float_format='%f')
For details, see the pandas.DataFrame.to_csv documentation.
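The difference between the display options and float_format can be checked on a single value. This is a minimal sketch, assuming one value taken from the first row of the table above and a hypothetical to_csv_text helper that writes to an in-memory buffer.

```python
import io

import pandas as pd


def to_csv_text(frame, **kwargs):
    """Write the frame without index or header and return the CSV text."""
    buf = io.StringIO()
    frame.to_csv(buf, index=False, header=False, **kwargs)
    return buf.getvalue().strip()


# One very small value, as in the first cell of the example output
df = pd.DataFrame({"x": [-2.01082e-12]})

# The display option is active while writing, yet the CSV text is not affected by it
with pd.option_context("display.precision", 5):
    default_text = to_csv_text(df)

six_places = to_csv_text(df, float_format="%f")         # 6 decimal places
fifteen_places = to_csv_text(df, float_format="%.15f")  # 15 decimal places

print(default_text, six_places, fifteen_places)
```

With only six decimal places the value is rounded away entirely, so the precision in the format string needs to match the magnitude of the data.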
-
==============================
3. If you want the values as formatted strings (for example, as part of a CSV file written with csv.writer), you can format the numbers before building the list.
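    import csv

    # header_row_list and df are assumed to be defined earlier
    with open('results_actout_file', 'w', newline='') as csvfile:
        resultwriter = csv.writer(csvfile, delimiter=',')
        resultwriter.writerow(header_row_list)
        # format each float as a fixed-point string with 17 decimal places
        resultwriter.writerow(df['label'].apply(lambda x: '%.17f' % x).values.tolist())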
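The same idea can be tried without touching the file system. The sketch below is a self-contained variant under stated assumptions: the 'label' values are made up, the output goes to an in-memory buffer, and each value is written on its own row rather than all in one row as in the snippet above.

```python
import csv
import io

import pandas as pd

# Hypothetical data; the 'label' column name follows the snippet above
df = pd.DataFrame({"label": [1e-7, 123456789.0]})

# newline="" lets csv.writer control the line endings itself
buf = io.StringIO(newline="")
writer = csv.writer(buf, delimiter=",")
writer.writerow(["label"])
for value in df["label"]:
    # Format each number as a fixed-point string before handing it to the writer
    writer.writerow([f"{value:.10f}"])

rows = buf.getvalue().splitlines()
print(rows)
```

Because the numbers are already strings when they reach csv.writer, it writes them exactly as formatted.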
from https://stackoverflow.com/questions/22995762/pandas-to-csv-suppress-scientific-notation-in-csv-file-when-writing-pandas-to-c by cc-by-sa and MIT license