[Python pandas] Combining a DataFrame and a Series: pd.concat(), append()

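In the previous post, we used the pd.concat() function from the Python pandas library to combine DataFrames top to bottom and side by side.

This post continues from there and shows how to combine a DataFrame with a Series using pd.concat() and append().

Compared with combining DataFrames with each other, DataFrame + Series can be a little confusing where the index is concerned, but the simple examples below should make it easy to follow.

Let's start by importing pandas, DataFrame, and Series.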

# importing libraries

In [1]: import pandas as pd
   ...: from pandas import DataFrame
   ...: from pandas import Series

(1) Combining a Series with a DataFrame side by side: pd.concat([df, Series], axis=1)

Combining a DataFrame with a Series yields a DataFrame. With axis=1, the result grows side by side: the Series is added as a new column on the right.

Look closely at the column names of the combined DataFrame. The name of the Series becomes the name of the new column.

In [2]: df_1 = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
   ...:                      'B': ['B0', 'B1', 'B2'],
   ...:                      'C': ['C0', 'C1', 'C2'],
   ...:                      'D': ['D0', 'D1', 'D2']},
   ...:                     index=[0, 1, 2])

In [3]: df_1

Out[3]:

    A   B   C   D
0  A0  B0  C0  D0
1  A1  B1  C1  D1
2  A2  B2  C2  D2

In [4]: Series_1 = pd.Series(['S1', 'S2', 'S3'], name='S')

In [5]: Series_1

Out[5]:

0    S1
1    S2
2    S3
Name: S, dtype: object

# Concatenating a DataFrame and a Series along columns (from left to right)

# The new column of the resulting DataFrame takes the name of the Series

In [6]: pd.concat([df_1, Series_1], axis=1)

Out[6]:

    A   B   C   D   S
0  A0  B0  C0  D0  S1
1  A1  B1  C1  D1  S2
2  A2  B2  C2  D2  S3
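
The example above works neatly because df_1 and Series_1 share the index 0, 1, 2. As a rough sketch of the index behavior that tends to cause confusion, the block below uses a hypothetical Series (s_shifted, not part of the original example) whose index only partly overlaps the DataFrame's. It assumes the default outer join of pd.concat() and then contrasts it with join='inner'.

```python
import pandas as pd

df = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
                   'B': ['B0', 'B1', 'B2']},
                  index=[0, 1, 2])

# Hypothetical Series whose index (1, 2, 3) only partly overlaps df's index
s_shifted = pd.Series(['S1', 'S2', 'S3'], index=[1, 2, 3], name='S')

# Rows are matched by index label, not by position.
# The default join='outer' keeps every label and fills the gaps with NaN.
aligned = pd.concat([df, s_shifted], axis=1)
print(aligned)

# join='inner' keeps only the labels present in both objects
inner = pd.concat([df, s_shifted], axis=1, join='inner')
print(inner)
```

If the Series is meant to line up by position instead, resetting its index to match the DataFrame before concatenating is one way to go.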

(2) Combining a Series with a DataFrame side by side

Ignoring the column names and assigning integer labels automatically: ignore_index=True

In [7]: pd.concat([df_1, Series_1], axis=1, ignore_index=True)

Out[7]:

    0   1   2   3   4
0  A0  B0  C0  D0  S1
1  A1  B1  C1  D1  S2
2  A2  B2  C2  D2  S3
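
One detail that is easy to miss: with axis=1, ignore_index=True renumbers the labels along the concatenation axis (the columns) and leaves the row index alone. The small sketch below tries to make that visible by using string row labels 'x' and 'y', which are made up for illustration.

```python
import pandas as pd

# Hypothetical row labels 'x' and 'y' so that rows and columns are easy to tell apart
df = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']}, index=['x', 'y'])
s = pd.Series(['S1', 'S2'], index=['x', 'y'], name='S')

renumbered = pd.concat([df, s], axis=1, ignore_index=True)

# Column labels 'A', 'B', 'S' are replaced by integers
print(list(renumbered.columns))

# The row index is unchanged
print(list(renumbered.index))
```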

(3) Combining Series with each other side by side: pd.concat([Series1, Series2, ...], axis=1)

If a Series has a name, that name is used as the column name in the combined DataFrame. Series without a name are assigned the integers 0, 1, 2, ... automatically.

In [8]: Series_1 = pd.Series(['S1', 'S2', 'S3'], name='S')

In [9]: Series_2 = pd.Series([0, 1, 2]) # without name

In [10]: Series_3 = pd.Series([3, 4, 5]) # without name

In [11]: Series_1

Out[11]:

0    S1
1    S2
2    S3
Name: S, dtype: object

In [12]: Series_2

Out[12]:

0    0
1    1
2    2
dtype: int64

In [13]: Series_3

Out[13]:

0    3
1    4
2    5
dtype: int64

# The name of each Series is used as its column name in the concatenated DataFrame

In [14]: pd.concat([Series_1, Series_2, Series_3], axis=1)

Out[14]:

    S  0  1
0  S1  0  3
1  S2  1  4
2  S3  2  5
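
If the automatic integers are not what you want, one option is to name the Series before concatenating. The sketch below uses Series.rename() with a scalar, which sets the Series name; the names 'X' and 'Y' are placeholders chosen only for this illustration.

```python
import pandas as pd

s1 = pd.Series(['S1', 'S2', 'S3'], name='S')
s2 = pd.Series([0, 1, 2])   # without name
s3 = pd.Series([3, 4, 5])   # without name

# Unnamed Series get integer column labels, counted among the unnamed ones only
auto = pd.concat([s1, s2, s3], axis=1)
print(list(auto.columns))

# rename() with a scalar returns a copy of the Series carrying that name
named = pd.concat([s1, s2.rename('X'), s3.rename('Y')], axis=1)
print(list(named.columns))
```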

(4) Overriding the column names when combining Series: keys=['xx', 'xx', ...]

In [15]: pd.concat([Series_1, Series_2, Series_3], axis=1, keys=['C0', 'C1', 'C1'])

Out[15]:

   C0  C1  C1
0  S1   0   3
1  S2   1   4
2  S3   2   5
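
Note that the call above passes 'C1' twice, so two columns end up with the same label, and selecting a duplicated label returns more than one column. As a small sketch, assuming unique labels are what you want, the variant below swaps in 'C2' for the last key (my substitution, not the original post's) so that each column can be selected on its own. It also shows that keys takes precedence over the existing Series name 'S'.

```python
import pandas as pd

s1 = pd.Series(['S1', 'S2', 'S3'], name='S')
s2 = pd.Series([0, 1, 2])
s3 = pd.Series([3, 4, 5])

# One unique key per Series; the keys replace the Series names as column labels
uniq = pd.concat([s1, s2, s3], axis=1, keys=['C0', 'C1', 'C2'])
print(uniq)

# With unique labels, each selection gives back a single column (a Series)
print(uniq['C2'].tolist())
```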

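(5) Combining a Series with a DataFrame top to bottom: df.append(Series, ignore_index=True)

Make sure to set ignore_index=True. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the calls in this section only run on older versions.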

In [16]: df_1

Out[16]:

    A   B   C   D
0  A0  B0  C0  D0
1  A1  B1  C1  D1
2  A2  B2  C2  D2

In [17]: Series_4 = pd.Series(['S1', 'S2', 'S3', 'S4'], index=['A', 'B', 'C', 'E'])

In [18]: Series_4

Out[18]:

A    S1
B    S2
C    S3
E    S4
dtype: object

In [19]: df_1.append(Series_4, ignore_index=True)

Out[19]:

    A   B   C    D    E
0  A0  B0  C0   D0  NaN
1  A1  B1  C1   D1  NaN
2  A2  B2  C2   D2  NaN
3  S1  S2  S3  NaN   S4
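
On pandas versions where DataFrame.append() is no longer available (it was removed in 2.0), a similar result can be obtained with pd.concat(). The sketch below is one possible way rather than the only one: it turns the Series into a one-row DataFrame with to_frame().T and stacks it under the original. The variable names row and result are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
                   'B': ['B0', 'B1', 'B2'],
                   'C': ['C0', 'C1', 'C2'],
                   'D': ['D0', 'D1', 'D2']},
                  index=[0, 1, 2])

s = pd.Series(['S1', 'S2', 'S3', 'S4'], index=['A', 'B', 'C', 'E'])

# to_frame() makes a one-column DataFrame; .T flips it into a single row
# whose columns are the Series index labels A, B, C, E
row = s.to_frame().T

# Stack the row under df; ignore_index=True renumbers the rows 0..3
result = pd.concat([df, row], ignore_index=True)
print(result)
```

As with append(), the index labels of the Series are matched against the column names, so 'E' becomes a new column and the missing 'D' value is filled with NaN.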

Without ignore_index=True, a 'TypeError' is raised as shown below, because Series_4 has no name.

In [20]: df_1.append(Series_4) # TypeError without 'ignore_index=True'

Traceback (most recent call last):

  File "<ipython-input-20-ca24d6ef8563>", line 1, in <module>
    df_1.append(Series_4) # TypeError without 'ignore_index=True'

  File "C:\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4314, in append
    raise TypeError('Can only append a Series if ignore_index=True'

TypeError: Can only append a Series if ignore_index=True or if the Series has a name
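
The error message hints at the other way out: give the Series a name, which then serves as the row label. Here is a rough sketch of that idea using pd.concat() so that it also runs where append() is unavailable. The name 'new' is a made-up label for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
                   'B': ['B0', 'B1', 'B2']},
                  index=[0, 1, 2])

# Hypothetical name 'new'; it becomes the index label of the added row
s_named = pd.Series(['S1', 'S2'], index=['A', 'B'], name='new')

# Without ignore_index=True the existing row labels are kept
# and the Series name is used as the label of the new row
labeled = pd.concat([df, s_named.to_frame().T])
print(labeled)
```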

