R, Python: A Friend for Analysis and Programming (by R Friend)

4 posts tagged 'len()'

[Python pandas] Counting the rows and columns of a DataFrame or Series

Python Analysis and Programming / Python Data Preprocessing, 2019. 7. 3. 23:49

This post shows how to count the number of rows and columns of a pandas DataFrame or Series.

These are simple operations that earlier posts already covered, but Stack Overflow has a table that organizes them neatly by case, so I have reproduced it here.

Note that the row count of a Series is the attribute s.size, written without a trailing (). Also keep in mind that

> count() counts only the non-null rows, while

> size() (on a groupby object) counts all rows, including the rows that hold null values.
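The difference between the size attribute, the count() method, and size() on a groupby object can be sketched as follows. This is a minimal sketch on a small made-up dataset (the values are my own, not from the Stack Overflow table); it also shows that df.size on a DataFrame is the number of cells, so it is not a row count there.

```python
import numpy as np
import pandas as pd

# Small illustrative data (hypothetical values)
s = pd.Series([1, 2, np.nan, 4])
df = pd.DataFrame({'grp': ['A', 'A', 'B', 'B'],
                   'val': [1, 2, np.nan, 4]})

n_rows_s = s.size           # attribute, no parentheses: every row, NaN included
n_non_null_s = s.count()    # method: only the non-null rows

n_rows_df = len(df)         # number of rows
n_cells_df = df.size        # rows * columns, so not a row count for a DataFrame

# On a groupby object, size is a method, unlike the Series attribute above
rows_per_group = df.groupby('grp').size()
non_null_per_group = df.groupby('grp')['val'].count()

print(n_rows_s, n_non_null_s)   # 4 3
print(n_rows_df, n_cells_df)    # 4 8
```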

Case	pandas DataFrame (df)	pandas Series (s)
Row count	len(df), df.shape[0], len(df.index)	len(s), s.size, len(s.index)
Column count	df.shape[1], len(df.columns)	N/A
Non-null row count	df.count()	s.count()
Row count per group	df.groupby(...).size()	s.groupby(...).size()
Non-null row count per group	df.groupby(...).count()	s.groupby(...).count()

A few simple examples follow.

First, import the numpy and pandas libraries and create a DataFrame and a Series to work with.

import numpy as np
import pandas as pd

df = pd.DataFrame({'grp': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                   'val': [1, 2, np.nan, 4, np.nan, np.nan, 7, 8, 9]})

In [02]: df

Out[02]:
  grp  val
0   A  1.0
1   A  2.0
2   A  NaN
3   B  4.0
4   B  NaN
5   B  NaN
6   C  7.0
7   C  8.0
8   C  9.0

s = pd.Series([1, 2, np.nan, 4, np.nan, np.nan, 7, 8, 9])

In [03]: s

Out[03]:
0    1.0
1    2.0
2    NaN
3    4.0
4    NaN
5    NaN
6    7.0
7    8.0
8    9.0
dtype: float64

The examples below follow the rows of the table, first for the DataFrame (df) and then for the Series (s).

Row count

In [04]: len(df)

Out[04]: 9

In [05]: df.shape[0]

Out[05]: 9

In [06]: len(df.index)

Out[06]: 9

In [08]: len(s)

Out[08]: 9

In [09]: s.size

Out[09]: 9

In [10]: len(s.index)

Out[10]: 9

Column count

In [11]: df.shape[1]

Out[11]: 2

In [12]: len(df.columns)

Out[12]: 2

Series: N/A

Non-null row count (ignoring NaNs)

In [13]: df.count()

Out[13]:
grp    9
val    6
dtype: int64

In [14]: df['val'].count()

Out[14]: 6

In [15]: s.count()

Out[15]: 6

Row count per group

In [16]: df.groupby('grp').size()

Out[16]:
grp
A    3
B    3
C    3
dtype: int64

In [17]: s.groupby(df.grp).size()

Out[17]:
grp
A    3
B    3
C    3
dtype: int64

Non-null row count per group

In [18]: df.groupby('grp').count()

Out[18]:
     val
grp
A      2
B      1
C      3

In [19]: s.groupby(df.grp).count()

Out[19]:
grp
A    2
B    1
C    3
dtype: int64

I hope this was helpful.

License: Attribution, NonCommercial, NoDerivatives


Posted by Rfriend

[Python] Python Tuple built-in functions and methods

Python Analysis and Programming / Installing Python and Basic Usage, 2017. 8. 27. 18:28

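The previous post covered creating, deleting, indexing, and slicing tuples, along with their basic operators.

This post covers the tuple built-in functions and the tuple methods.

Note that the built-in functions that work on tuples are the same ones that work on lists, but a tuple has far fewer methods than a list. Because a tuple is immutable (its individual elements cannot be changed), it does not support appending elements, extending, removing elements, reversing in place, or sorting in place.

This post is an easy one and will be over quickly. ^^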
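To see how few methods a tuple has compared with a list, the public method names can be listed by introspection. This is a minimal sketch; the variable names are my own, and the exact list-only names may vary slightly between Python versions.

```python
# Public method names on list and tuple, found by introspection
list_methods = {name for name in dir(list) if not name.startswith('_')}
tuple_methods = {name for name in dir(tuple) if not name.startswith('_')}

print(sorted(tuple_methods))                 # ['count', 'index']
print(sorted(list_methods - tuple_methods))  # methods that only a list has

tup = (3, 1, 2)
sorted_list = sorted(tup)            # sorted() returns a new list; tup is unchanged
reversed_tup = tuple(reversed(tup))  # build a new tuple in reverse order
```

Sorting and reversing are still possible, but only by building a new object rather than changing the tuple in place.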

[ Python Tuple built-in functions and methods ]

1. Python Tuple built-in functions

1-1. len(tuple) : total length of the tuple

# len() : Gives the total length of the tuple

>>> len((1, 2, 3))

3

1-2. max(tuple) : returns the largest item in the tuple (strings are compared alphabetically)

# max(): Returns item from the tuple with max value

>>> max((1, 2, 3, 4, 5))

5

>>> max(('a', 'b', 'c', 'd', 'e')) # As for character, in order of alphabet

'e'

If the tuple mixes strings and numbers, calling the max() function raises a TypeError.

# TypeError for max() function when 'str' and 'int' are mixed in a tuple

>>> max((1, 2, 3, 'a', 'b', 'c'))

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

TypeError: '>' not supported between instances of 'str' and 'int'
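If a mixed tuple has to be handled anyway, the error can be caught, or the comparison can be made well defined. The sketch below shows two possible workarounds; whether comparing by string representation is acceptable depends on the task, so treat key=str as an assumption rather than a recommendation.

```python
mixed = (1, 2, 3, 'a', 'b', 'c')

try:
    max(mixed)
except TypeError as err:
    error_message = str(err)   # str and int cannot be ordered against each other

# Assumption: comparing every item by its string form is acceptable here
largest_as_text = max(mixed, key=str)

# Alternative: compare only the items of one type
largest_number = max(x for x in mixed if isinstance(x, int))

print(largest_as_text)   # c
print(largest_number)    # 3
```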

1-3. min(tuple) : returns the smallest item in the tuple (strings are compared alphabetically)

# min() : Returns item from the tuple with min value

>>> min((1, 2, 3, 4, 5))

1

>>> min(('a', 'b', 'c', 'd', 'e'))

'a'

1-4. tuple(seq) : converts a list into a tuple

# tuple(seq) : Converts a list to tuple

>>> my_list = [1, 2, 'a', 'b']

>>> type(my_list)

<class 'list'>

>>> my_tup = tuple(my_list)

>>> my_tup

(1, 2, 'a', 'b')

>>> type(my_tup)

<class 'tuple'>

2. Python Tuple methods

2-1. tuple.count(obj) : counts how many times obj occurs in the tuple

# tuple.count(obj) : Returns the total number of obj in tuple

>>> tup = (1, 2, 3, 4, 5, 2, 2)

>>> tup.count(2)

3

2-2. tuple.index(obj) : returns the index at which obj occurs in the tuple

If the same value occurs two or more times, the index of its first occurrence is returned.

# tuple.index(obj) : Returns the index of obj in tuple

>>> tup = (1, 2, 3, 4, 5, 2, 2)

>>> tup.index(2)

1
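The two methods behave differently when the value is missing, and index() also accepts a start position. A minimal sketch using the same tuple (the variable names are my own):

```python
tup = (1, 2, 3, 4, 5, 2, 2)

first_pos = tup.index(2)       # 1, the first occurrence
next_pos = tup.index(2, 2)     # 5, searching from index 2 onward
missing_count = tup.count(9)   # 0, count() never raises for a missing value

try:
    tup.index(9)
    missing_pos = 'found'
except ValueError:
    missing_pos = None         # index() raises ValueError for a missing value

# Every position where 2 occurs
all_positions = [i for i, x in enumerate(tup) if x == 2]
print(all_positions)           # [1, 5, 6]
```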

The next post will cover the basic usage and characteristics of the dictionary data type.

I hope this was helpful.


[Python] Creating tuples and basic usage (Python Tuple : basic operations, indexing, slicing)

Python Analysis and Programming / Installing Python and Basic Usage, 2017. 8. 27. 11:53

Python has three data types for handling multiple items of data: the list, the tuple, and the dictionary.

The previous post covered the list. This post and the next one introduce the tuple.

A tuple resembles a list but differs from it in one major way, which can confuse people who are new to Python.

Both tuples and lists group objects of different types into one sequence. The decisive difference is that a tuple is immutable (a tuple is an immutable sequence of Python objects), whereas a list can be changed. (Note: a string is also immutable, like a tuple.)

A list is created by enclosing items in square brackets ([ ]), while a tuple is created with parentheses (( ), round brackets).

[ Python's 5 Data Types ]

You may wonder why an immutable type is needed at all. It offers two benefits: better software performance, and code that the programmer can trust.

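[ Why do we need immutable data types? ]

"Compared with mutable data types, immutable data types help improve software performance. Because neither the contents nor the size of the allocated space changes, creating them is simple, and because the data is guaranteed not to be corrupted, the original can be used directly instead of making a copy.

More important than performance, though, the biggest advantage of immutable data types is that programmers can trust their own code. When writing thousands or tens of thousands of lines of code, it is easy to introduce a bug that corrupts data that should never change. After a few such mistakes, tracking down where the problem arose becomes quite hard. That is why, from the design stage on, you should sort out which data may change and which may not, and reflect that in the code."

* Source: '뇌를 자극하는 파이썬 3', by 박상현, 한빛미디어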
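The point about trusting your own code can be illustrated with a small sketch. The function add_total below is a hypothetical helper of my own (it is not from the book) that carelessly modifies its argument: with a list the caller's data changes silently, while with a tuple the mistake surfaces immediately as an error.

```python
def add_total(values):
    """Hypothetical helper that (wrongly) appends a total to its input."""
    values.append(sum(values))
    return values

scores_list = [90, 80]
add_total(scores_list)          # the caller's list is silently changed
print(scores_list)              # [90, 80, 170]

scores_tuple = (90, 80)
try:
    add_total(scores_tuple)     # a tuple has no append(), so the bug shows up at once
except AttributeError as err:
    failure = str(err)
print(scores_tuple)             # (90, 80)
```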

Now let's create tuples using parentheses, or just commas.

1. Creating a tuple using parentheses or commas : (obj, )

(1-1) Creating a tuple using parentheses ('( )', round brackets)

# tuples are enclosed within parentheses

>>> tuple_1 = ('abc', 123, 3.14, ['edf', 456], ('gh', 'st'))

>>> tuple_1

('abc', 123, 3.14, ['edf', 456], ('gh', 'st'))

>>> type(tuple_1)

<class 'tuple'>

(1-2) Creating a tuple with commas and no parentheses (Tuple packing)

# a tuple is created by writing comma-separated values (without parentheses)

>>> tuple_1_2 = 'abc', 123, 3.14, ['edf', 456], ('gh', 'st')

>>> tuple_1_2

('abc', 123, 3.14, ['edf', 456], ('gh', 'st'))

>>> type(tuple_1_2)

<class 'tuple'>

(1-3) Creating a tuple with a single element => the comma is required

# creating tuple with 1 element using parentheses and a comma

>>> tuple_1_element_with_comma = (123, )

>>> tuple_1_element_with_comma

(123,)

>>> type(tuple_1_element_with_comma)

<class 'tuple'>

If a single object is wrapped in parentheses without a trailing comma, it is stored as an integer (int), not a tuple, so take care.

# if you don't include a comma for a single value, it will be an int, not a tuple

>>> int_1_element_without_comma = (123) # without a comma

>>> int_1_element_without_comma

123

>>> type(int_1_element_without_comma)

<class 'int'>

2. Deleting a Tuple : del tuple

Because a tuple is immutable, its individual elements cannot be removed (a list can remove individual elements with the pop or remove method). The entire tuple, however, can be deleted with the del statement.

# Deleting an entire Tuple with 'del' statement

>>> tuple_1 = ('abc', 123, 3.14, ['edf', 456], ('gh', 'st'))

>>> tuple_1

('abc', 123, 3.14, ['edf', 456], ('gh', 'st'))

>>> del tuple_1

>>> tuple_1 # tuple_1 is removed with del statement above

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

NameError: name 'tuple_1' is not defined

3. Changing an individual element of a tuple raises a TypeError

>>> tup_1 = (1, 2, 3)

>>> tup_1[0] = 4

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

TypeError: 'tuple' object does not support item assignment
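One caveat worth knowing: the immutability applies to the tuple's own slots, not to the objects stored in them. A minimal sketch, using a tuple that holds a list like tuple_1 above:

```python
tuple_1 = ('abc', 123, ['edf', 456])

try:
    tuple_1[2] = ['xyz']       # rebinding a slot of the tuple is not allowed
except TypeError as err:
    message = str(err)

tuple_1[2].append(789)         # but the list stored in that slot can still change
print(tuple_1)                 # ('abc', 123, ['edf', 456, 789])
```

So a tuple only guarantees unchanging contents when every element is itself immutable.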

4. Tuple indexing, slicing : tup[0], tup[0:3], tup[-1], tup[-3:-1]

Like lists and strings, a tuple is a sequence type, so square brackets ([ ]) can be used to index a single element or to slice a range of elements. Indexing starts at 0.

Note that tup[0] returns the element itself (an integer here), while tup[0:1] returns a tuple.

# Tuple indexing and slicing start at zero

>>> tuple_2 = (1, 2, 3, 4, 5, 6, 7)

>>> tuple_2

(1, 2, 3, 4, 5, 6, 7)

>>> tuple_2[0] # Indexing one element of tuple => int

1

>>> tuple_2[0:1] # Slicing one element of tuple => tuple

(1,)

>>> tuple_2[0:3] # Tuple slicing

(1, 2, 3)

>>> tuple_2[3:] # Tuple slicing

(4, 5, 6, 7)

When an index or a slice bound carries a '-' (negative) sign, positions are counted from the right, and the rightmost element is '-1'.

# In case of negative(-), it counts from the right

>>> tuple_2 = (1, 2, 3, 4, 5, 6, 7)

>>> tuple_2

(1, 2, 3, 4, 5, 6, 7)

>>> tuple_2[-1] # indexing from the right

7

>>> tuple_2[-1:]

(7,)

>>> tuple_2[-3:-1] # slicing from the right

(5, 6)

>>> tuple_2[-3:]

(5, 6, 7)
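Negative positions are equivalent to counting from len(tuple), and a slice also accepts an optional step. The sketch below checks the equivalence and adds a few slices that the examples above do not cover (the step form is my own addition):

```python
tuple_2 = (1, 2, 3, 4, 5, 6, 7)
n = len(tuple_2)

# A negative index i refers to position n + i
same_last = tuple_2[-1] == tuple_2[n - 1]
same_slice = tuple_2[-3:-1] == tuple_2[n - 3:n - 1]

every_other = tuple_2[::2]      # (1, 3, 5, 7)
reversed_copy = tuple_2[::-1]   # (7, 6, 5, 4, 3, 2, 1)
clipped = tuple_2[5:100]        # (6, 7): out-of-range slice bounds are clipped
```

Unlike indexing, slicing never raises an IndexError for out-of-range bounds.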

5. Grouping several values into a tuple (Tuple Packing)

<--> Assigning each element of a tuple to several variables (Tuple Unpacking)

# Tuple packing

>>> tup_packing = 'Mr.Lee', 25, 'Seoul', 'KOREA'

>>> tup_packing

('Mr.Lee', 25, 'Seoul', 'KOREA')

# Tuple unpacking

>>> name, age, city, nationality = tup_packing

>>> name

'Mr.Lee'

>>> age

25

>>> city

'Seoul'

>>> nationality

'KOREA'

When unpacking, if the number of elements in the tuple differs from the number of variables on the left, a ValueError is raised.

# if the number of unpacked values and variables is not the same, then ValueError occurs

>>> tup_packing # 4 elements

('Mr.Lee', 25, 'Seoul', 'KOREA')

>>> name, age, city = tup_packing # 3 variables

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

ValueError: too many values to unpack (expected 3)
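When only some of the elements are needed, Python 3's starred assignment avoids the ValueError by collecting the leftovers into a list. A minimal sketch with the same tuple:

```python
tup_packing = 'Mr.Lee', 25, 'Seoul', 'KOREA'

# The starred name collects whatever is left over, as a list
name, age, *rest = tup_packing
first, *middle, last = tup_packing

print(rest)     # ['Seoul', 'KOREA']
print(middle)   # [25, 'Seoul']
```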

6. Basic Tuples Operations

As with lists, the basic tuple operations are the len() function for the length, the '+' operator for concatenation, the '*' operator for repeating the elements, the 'in' operator, which returns a Boolean telling whether a value is present, and iteration with a for loop.

Description	Python Code	Results
Length	len((1, 2, 3))	3
Concatenation	(1, 2, 3) + ('a', 'b', 'c')	(1, 2, 3, 'a', 'b', 'c')
Repetition	(1, 'a')*3	(1, 'a', 1, 'a', 1, 'a')
Membership	3 in (1, 2, 3), 4 in (1, 2, 3)	True, False
Iteration with a for loop	for x in (1, 2, 3): print(x)	1 2 3
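The rows of the table can be run as one short script. This is just the table restated as code, with variable names of my own; note that '+' and '*' build new tuples and leave the originals unchanged.

```python
nums = (1, 2, 3)
letters = ('a', 'b', 'c')

length = len(nums)              # 3
combined = nums + letters       # (1, 2, 3, 'a', 'b', 'c'), a new tuple
repeated = (1, 'a') * 3         # (1, 'a', 1, 'a', 1, 'a')
has_three = 3 in nums           # True
has_four = 4 in nums            # False

collected = []
for x in nums:                  # iteration visits the elements in order
    collected.append(x)
```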

The next post will cover tuple built-in functions and tuple methods.

I hope this was helpful.


[Python] Methods for processing strings (Python string methods)

Python Analysis and Programming / Installing Python and Basic Usage, 2017. 8. 4. 13:52

Of Python's data types,

- Number,

- String,

- List,

- Tuple,

- Dictionary,

the previous post introduced the basic usage of numbers and strings.

Continuing from there, this post covers string methods, the functions specific to the string type that handle strings in many different ways.

Once you know the string methods, a preprocessing task that would otherwise need hand-written code often takes one or two lines, which makes work more efficient, and the built-in methods generally run faster than an equivalent hand-written for loop.

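[Note] Method

Unlike a built-in function, a function that belongs to a particular data type, such as the string type, is called a method. 'Method' is the object-oriented programming term for such a function. It means almost the same as a function, except that a method is a member of a class.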
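The distinction can be seen directly in code. In this minimal sketch, len() is a built-in function that is given the string as an argument, while upper() is a method looked up on the str class:

```python
a = 'I Love Python'

length = len(a)                # built-in function: the string is the argument
same_length = a.__len__()      # the special method that len() relies on for str

upper_1 = a.upper()            # method call on the object
upper_2 = str.upper(a)         # the same method, reached through the class

is_str_method = 'upper' in dir(str)   # True: upper is a member of the str class
```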

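It helps to study them in advance so that you at least remember 'ah, there was a string method for that!' and can look it up quickly. Below, the string methods are grouped by what they do. Personally, I often use the built-in len() function and the methods find(), lower(), upper(), lstrip(), rstrip(), split(), splitlines(), replace(), join(), and zfill().

[ String Methods in Python ]

This post draws on the English-language material at https://www.tutorialspoint.com/python/python_strings.htm. That page only describes what each method does, so I have added examples to make them easier to understand.

I have left out a few methods, such as expandtabs() and maketrans(), that I have never used and am unlikely to need; that selection is my own subjective call.

Let's go through them one by one with examples.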

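1. String methods based on calculation

Of the items below, len(), min(), and max() are built-in functions that accept a string, and count() is a string method.

len() : length of the string

# len() : Returns the length of the string

>>> a = 'I Love Python'

>>> len(a)

13

min(), max() : the smallest and largest character in the string (by alphabetical order for letters, numeric order for digits)

# max(str), min(str) : Returns the max, min alphabetical character from the string str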

>>> d = 'abc'

>>> f = '123'

>>>

>>> min(d)

'a'

>>> max(d)

'c'

>>>

>>> min(f)

'1'

>>> max(f)

'3'

count() : counts how many times the substring passed as the argument occurs in the string

(begin and end positions can be specified)

# count() : Counts how many times str occurs in string

>>> a = 'I Love Python'

>>> a.count('o')

2

>>> a.count('o', 7, len(a)) # count(string, begin, end)

1

>>> a.count('k') # there is no 'k' character in 'a' string

0
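Two details of count() are easy to miss: it is case-sensitive, and it counts non-overlapping occurrences. A minimal sketch (the second string is my own example):

```python
a = 'I Love Python'

total_o = a.count('o')               # 2
from_index_7 = a.count('o', 7)       # 1, the end position is optional

lower_l = a.count('l')               # 0, because the only 'L' is uppercase
any_case_l = a.lower().count('l')    # 1, after normalizing the case

non_overlapping = 'aaaa'.count('aa') # 2, not 3: matches do not overlap
```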

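2. Methods that check whether a string contains given characters and where they are located

startswith() : returns True if the string starts with the substring passed as the argument, otherwise False

# startswith(): Determines if string (or a substring of it) starts with str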

>>> a = 'I Love Python'

>>> a.startswith('I')

True

>>> a.startswith('I Lo')

True

>>> a.startswith('U')

False

endswith() : returns True if the string ends with the substring passed as the argument, otherwise False

# endswith(): Determines if string (or a substring of it) ends with str

>>> a = 'I Love Python'

>>> a.endswith('Python')

True

>>> a.endswith('Pycham')

False

find() : searches from the start of the string for the substring passed as the argument and returns its index, or '-1' if it is not found

# find() : Search forwards, Determine if str occurs in string and return the index

>>> a = 'I Love Python'

>>> a.find('o')

3

>>> a.find('k') # if there is no string, then '-1'

-1

rfind() : searches from the end of the string for the substring passed as the argument and returns its index, or '-1' if it is not found

# rfind() : Same as find(), but search backwards in string

>>> a = 'I Love Python'

>>> a.rfind('o')

11

index() : same as find(), but raises a ValueError if the substring is not found

# index(): Same as find(), but raises an exception if str not found

>>> a = 'I Love Python'

>>> a.index('o')

3

>>> a.index('k') # ValueError: substring not found

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

ValueError: substring not found

rindex() : same as index(), but searches from the end of the string

# rindex(): Same as index(), but search backwards in string

>>> a = 'I Love Python'

>>> a.rindex('o')

11

>>> a.rindex('k') # ValueError: substring not found

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

ValueError: substring not found
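Which of the two to use depends on how a missing substring should be handled. The sketch below wraps find() in a hypothetical helper, position_or_none (my own name), handles the ValueError from index(), and shows the 'in' operator for cases where only presence matters.

```python
a = 'I Love Python'

def position_or_none(text, sub):
    """Hypothetical helper: index of sub in text, or None when absent."""
    pos = text.find(sub)
    return None if pos == -1 else pos

found = position_or_none(a, 'o')      # 3
missing = position_or_none(a, 'k')    # None

try:
    a.index('k')
    index_failed = False
except ValueError:
    index_failed = True               # index() signals absence with an exception

contains_k = 'k' in a                 # False: enough when the position is not needed
last_o = a.rfind('o')                 # 11
```

Checking the -1 return value matters: used directly as an index, -1 would silently refer to the last character.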

3. Methods that check whether a string consists of digits or letters

isalnum() : True if the string consists only of letters and digits, otherwise False

>>> a = 'I Love Python'

>>> d = 'abc'

>>> e = '123abc'

>>> f = '123'

>>>

# isalnum() : Returns true if string has at least 1 character and

# all characters are alphanumeric and false otherwise

>>> a.isalnum() # False

False

>>> d.isalnum() # True

True

>>> e.isalnum() # True

True

>>> f.isalnum() # True

True

isalpha() : True if the string consists only of letters (English, Korean, and so on), otherwise False

>>> a = 'I Love Python'

>>> d = 'abc'

>>> e = '123abc'

>>> f = '123'

>>>

# isalpha() : Returns true if string has at least 1 character

# and all characters are alphabetic and false otherwise

>>> a.isalpha() # False (there is space between characters)

False

>>> d.isalpha() # True

True

>>> e.isalpha() # False

False

>>> f.isalpha() # False

False

isdigit() : True if the string contains only digits, otherwise False. For ordinary digits such as '123' it gives the same result as isnumeric()

>>> a = 'I Love Python'

>>> d = 'abc'

>>> e = '123abc'

>>> f = '123'

>>>

# isdigit() : Returns true if string contains only digits and false otherwise

>>> a.isdigit() # False

False

>>> d.isdigit() # False

False

>>> e.isdigit() # False

False

>>> f.isdigit() # True

True

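isnumeric() : True if the string consists only of numeric characters, otherwise False. For ordinary digits such as '123' it gives the same result as isdigit(), although the two differ for some Unicode characters such as fractions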

>>> a = 'I Love Python'

>>> d = 'abc'

>>> e = '123abc'

>>> f = '123'

>>>

# isnumeric(): Returns true if a unicode string contains only numeric characters

>>> a.isnumeric() # False

False

>>> d.isnumeric() # False

False

>>> e.isnumeric() # False

False

>>> f.isnumeric() # True

True

isdecimal() : True if every character in the string is a decimal digit, otherwise False

>>> a = 'I Love Python'

>>> d = 'abc'

>>> e = '123abc'

>>> f = '123'

>>>

# isdecimal(): Returns true if a unicode string contains only decimal characters

>>> a.isdecimal() # False

False

>>> d.isdecimal() # False

False

>>> e.isdecimal() # False

False

>>> f.isdecimal() # True

True
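For plain ASCII digits the three checks agree, but they differ on some Unicode characters, and none of them accepts a sign or a decimal point. The sketch below compares them on a few sample strings of my own choosing (superscript two and the one-half fraction are written as escapes):

```python
samples = ['123', '\u00b2', '\u00bd', '-1', '1.5', '']

# (isdecimal, isdigit, isnumeric) for each sample
checks = {text: (text.isdecimal(), text.isdigit(), text.isnumeric())
          for text in samples}

for text, result in checks.items():
    print(repr(text), result)

# int() accepts decimal digits only, so isdecimal() is the closest match for it
try:
    int('\u00b2')
    superscript_converts = True
except ValueError:
    superscript_converts = False
```

So isdecimal() is the strictest of the three, isnumeric() the broadest, and a string such as '-1' or '1.5' needs a different check (for example a try/except around int() or float()).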

4. String methods that check and convert upper and lower case

islower() : True if all the cased characters in the string are lowercase, otherwise False

>>> a = 'I Love Python'

>>> g = 'i love python'

>>> h = 'I LOVE PYTHON'

>>>

# islower(): Returns true if string has at least 1 cased character

# and all cased characters are in lowercase and false otherwise

>>> a.islower() # False

False

>>> g.islower() # True

True

>>> h.islower() # False

False

isupper() : True if all the cased characters in the string are uppercase, otherwise False

>>> a = 'I Love Python'

>>> g = 'i love python'

>>> h = 'I LOVE PYTHON'

>>>

# isupper(): Returns true if string has at least one cased character

# and all cased characters are in uppercase and false otherwise

>>> a.isupper() # False

False

>>> g.isupper() # False

False

>>> h.isupper() # True

True

lower() : converts every uppercase letter in the string to lowercase

>>> a = 'I Love Python'

>>>

# lower(): Converts all uppercase letters in string to lowercase

>>> a.lower()

'i love python'

upper() : converts every lowercase letter in the string to uppercase

>>> a = 'I Love Python'

>>>

# upper(): Converts lowercase letters in string to uppercase

>>> a.upper()

'I LOVE PYTHON'

swapcase() : converts lowercase letters in the string to uppercase and uppercase letters to lowercase

# swapcase(): Inverts case for all letters in string

>>> a = 'I Love Python'

>>> a.swapcase() # 'i lOVE pYTHON'

'i lOVE pYTHON'

>>>

>>> g = 'i love python'

>>> g.swapcase() # 'I LOVE PYTHON' (same as upper())

'I LOVE PYTHON'

>>>

>>> h = 'I LOVE PYTHON'

>>> h.swapcase() # 'i love python' (same as lower())

'i love python'

istitle() : True if the string is title-cased (each word starts with an uppercase letter followed by lowercase letters), otherwise False

>>> a = 'I Love Python'

>>> g = 'i love python'

>>> h = 'I LOVE PYTHON'

>>>

# istitle(): Returns true if string is properly "titlecased" and false otherwise

>>> a.istitle() # True

True

>>> g.istitle() # False

False

>>> h.istitle() # False

False

title() : converts the string to title case, with each word starting in uppercase and the rest in lowercase

>>> g = 'i love python'

>>> h = 'I LOVE PYTHON'

>>>

# title(): Returns "titlecased" version of string, that is,

# all words begin with uppercase and the rest are lowercase

>>> g.title() # 'I Love Python'

'I Love Python'

>>> h.title() # 'I Love Python'

'I Love Python'

capitalize() : converts the first character of the string to uppercase and all the rest to lowercase

>>> a = 'I Love Python'

>>> g = 'i love python'

>>> h = 'I LOVE PYTHON'

>>>

# capitalize(): Capitalizes first letter of string

>>> a.capitalize() # 'I love python'

'I love python'

>>> g.capitalize() # 'I love python'

'I love python'

>>> h.capitalize() # 'I love python'

'I love python'
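One thing to watch with title() is that it treats every run of letters as a word, so an apostrophe starts a new word. The sketch below uses a sentence of my own and compares title(), capitalize(), and string.capwords() from the standard library, which splits on whitespace only.

```python
import string

text = "they're python's friends"

titled = text.title()             # "They'Re Python'S Friends"
capitalized = text.capitalize()   # "They're python's friends"
capworded = string.capwords(text) # "They're Python's Friends"
```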

5. String methods that check for and handle whitespace

lstrip() : removes whitespace on the left of the string

# lstrip() : Removes all leading whitespace in string

>>> b = ' I Love Python'

>>> b.lstrip()

'I Love Python'

rstrip() : removes whitespace on the right of the string

# rstrip() : Removes all trailing whitespace of string

>>> c = 'I Love Python '

>>> c.rstrip()

'I Love Python'

strip() : removes whitespace on both sides of the string

# strip() : Performs both lstrip() and rstrip() on string

>>> ' I Love Python '.strip()

'I Love Python'
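The strip family removes tabs and newlines as well as spaces, and it also accepts an optional argument naming the characters to remove. That argument is treated as a set of characters, not as a prefix or suffix, which the last line of this sketch illustrates (the sample strings are my own):

```python
padded = '\t  I Love Python \n'

both_sides = padded.strip()       # 'I Love Python'
left_only = padded.lstrip()       # 'I Love Python \n'
right_only = padded.rstrip()      # '\t  I Love Python'

custom = 'xxI Love Pythonxx'.strip('x')   # 'I Love Python'

# Pitfall: '.py' means "any of the characters '.', 'p', 'y'", not the suffix '.py'
surprising = 'happy.py'.rstrip('.py')     # 'ha'
```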

isspace() : True if the string consists only of whitespace, otherwise False

# isspace(): Returns true if string contains only whitespace characters and false otherwise

>>> i = ' '

>>> j = ' I Love Python'

>>>

>>> i.isspace() # True

True

>>> j.isspace() # False

False

center(width) : pads the string with spaces so that its total length equals width, with the original text centered

# center(): Returns a space-padded string with the original string

# centered to a total of width columns

>>> a = 'I Love Python'

>>> a.center(21)

'    I Love Python    '

6. String methods that split, join, replace, and fill (split, join, replace, fill)

split() : splits the string on a delimiter (separator)

split() is a string method that gets used very often.

# split(): Splits string according to delimiter str (space if not provided)

# and returns list of substrings; split into at most num substrings if given

>>> x = 'haha, hoho, hihi'

>>> x.split(sep=',') # as a list ['haha', ' hoho', ' hihi']

['haha', ' hoho', ' hihi']

>>> ha, ho, hi = x.split(sep=',')

>>> ha

'haha'

>>> ho

' hoho'

>>> hi

' hihi'

>>> a = 'I Love Python'

>>> a.split(' ') # without arg 'sep='

['I', 'Love', 'Python']

>>> a.split() # default delimiter is space if not provided

['I', 'Love', 'Python']
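Two details of split() deserve a closer look: the leading spaces left in ' hoho' and ' hihi' above, and the difference between split(' ') and split() when there are repeated spaces. A minimal sketch (the string spaced is my own example):

```python
x = 'haha, hoho, hihi'

cleaned = [part.strip() for part in x.split(',')]   # ['haha', 'hoho', 'hihi']
by_comma_space = x.split(', ')                      # ['haha', 'hoho', 'hihi']
first_only = x.split(',', 1)                        # ['haha', ' hoho, hihi']

spaced = 'I  Love   Python'
explicit_space = spaced.split(' ')   # ['I', '', 'Love', '', '', 'Python']
any_whitespace = spaced.split()      # ['I', 'Love', 'Python']
```

With no argument, split() treats any run of whitespace as one separator; with an explicit ' ', every single space separates, which produces empty strings.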

splitlines() : splits a multi-line string into a list with one item per line

# splitlines(): returns a list with all the lines in string,

# optionally including the line breaks (if keepends is supplied and is true)

>>> y = 'haha, \nhoho, \nhihi'

>>> y

'haha, \nhoho, \nhihi'

>>> y.splitlines() # ['haha, ', 'hoho, ', 'hihi']

['haha, ', 'hoho, ', 'hihi']
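The optional keepends argument and the difference from split('\n') can be sketched as follows (the Windows-style string is my own example):

```python
y = 'haha, \nhoho, \nhihi'

without_breaks = y.splitlines()               # ['haha, ', 'hoho, ', 'hihi']
with_breaks = y.splitlines(keepends=True)     # ['haha, \n', 'hoho, \n', 'hihi']

windows_text = 'a\r\nb\n'
lines = windows_text.splitlines()             # ['a', 'b']
naive = windows_text.split('\n')              # ['a\r', 'b', '']
```

splitlines() recognizes several line-ending styles and does not produce a trailing empty string, which is why it is usually the safer choice for text read from files.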

replace(old, new, max) : replaces the substring old with new. If max is given, only the first max occurrences are replaced and the rest are left alone

# replace(old, new): Replaces all occurrences of old in string with new

# or at most max occurrences if max given

>>> a = 'I Love Python'

>>> a.replace('Python', 'R')

'I Love R'

>>>

>>> a_2 = 'I Love Python, Python, Python, Python, Python~!!!'

>>> a_2.replace('Python', 'R', 3) # str.replace(old, new, max)

'I Love R, R, R, Python, Python~!!!'

join() : concatenates several strings, inserting the separator string between them

join() is one of the more frequently used string methods.

# join(): Merges (concatenates) the string representations of elements

# in sequence seq into a string, with separator string

>>> mylist = ['I', 'Love', 'Python']

>>> print(mylist)

['I', 'Love', 'Python']

>>>

>>> mystring = '_'.join(mylist)

>>> print(mystring) # 'I_Love_Python'

I_Love_Python

# To concatenate item in list to strings with join() method

>>> mylist_num = [1, 2, 3, 4, 5]

>>> print(mylist_num) # [1, 2, 3, 4, 5]

[1, 2, 3, 4, 5]

>>>

>>> mylist_str = ''.join(map(str, mylist_num))

>>> print(mylist_str) # 12345

12345

>>>

>>> '_'.join(map(str, mylist_num)) # '1_2_3_4_5'

'1_2_3_4_5'

zfill(width) : pads the string to a length of width, filling the missing positions on the left with '0'

# zfill(width): Returns original string leftpadded with zeros to a total of width characters;

# intended for numbers, zfill() retains any sign given (less one zero)

>>> f = '123'

>>> f.zfill(10)

'0000000123'
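The comment about retaining the sign can be checked with a few more cases. This sketch uses values of my own and also shows an equivalent format specification for integers:

```python
padded = '123'.zfill(10)        # '0000000123'
negative = '-42'.zfill(5)       # '-0042': zeros go after the sign
positive = '+7'.zfill(4)        # '+007'
too_long = '12345'.zfill(3)     # '12345': never truncates

from_int = str(7).zfill(3)      # '007', zfill() is a str method, so convert first
formatted = f'{7:03d}'          # '007', the same result via a format spec
```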

ljust(width[, fillchar]) : pads the string to a length of width, keeping the original string on the left and

filling the missing positions on the right with the fillchar character

# ljust(): Returns a space-padded string with the original string left-justified

# to a total of width columns

# str.ljust(width[, fillchar])

>>> a = 'I Love Python'

>>> a.ljust(20, 'R')

'I Love PythonRRRRRRR'

rjust(width[, fillchar]) : pads the string to a length of width, keeping the original string on the right and

filling the missing positions on the left with the fillchar character

# rjust(): Returns a space-padded string with the original string right-justified

# to a total of width columns

>>> a.rjust(20, 'R')

'RRRRRRRI Love Python'

>>> a.rjust(20, ' ')

'       I Love Python'
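A common use of ljust() and rjust() is lining up columns of text. The sketch below builds two aligned rows from made-up data and also shows that the same padding can be written as a format specification inside an f-string:

```python
a = 'I Love Python'

# ljust/rjust and the equivalent format specs give the same strings
left_same = a.ljust(20, 'R') == f'{a:R<20}'
right_same = a.rjust(20) == f'{a:>20}'

# Aligning a small table (hypothetical data): names left, quantities right
rows = [('apple', 3), ('kiwi', 12)]
lines = [name.ljust(8) + str(qty).rjust(4) for name, qty in rows]
for line in lines:
    print(line)
```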

The next post will cover string formatting (the string formatting operator).

I hope this was helpful.
