
[Python NumPy] Universal functions: (1-2) Unary ufuncs on a single array: sum, cumulative sum (cumsum), product, cumulative product (cumprod), difference, and gradient

Python Analysis and Programming / Python Data Preprocessing, 2017. 3. 9. 23:58

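In the previous post I briefly introduced the two groups of universal functions that NumPy provides:

- (1) universal functions that take a single array (unary universal functions)

- (2) universal functions that take two arrays (binary universal functions)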

Among the unary universal functions, that post covered (1-1) the rounding functions (np.around, np.round_, np.rint, np.fix, np.ceil, np.floor, np.trunc).

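This post continues with the single-array functions and covers (1-2) the functions that compute products, sums, and differences across the elements of an array. Strictly speaking, np.prod, np.sum, np.diff, and the others below are ordinary array functions rather than np.ufunc objects; this series groups them with the unary ufuncs because each takes a single array. (There are too many functions to cover in one post, so they are split over several posts.)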

[ Unary ufuncs: products, sums, and differences across array elements ]

(1-2) Functions for products, sums, differences, and gradients across array elements

The examples below use a 1-dimensional array b and a 2-dimensional array c.

In [1]: import numpy as np

In [2]: b = np.array([1, 2, 3, 4]) # 1 dimension

In [3]: b

Out[3]: array([1, 2, 3, 4])

In [4]: c = np.array([[1, 2], [3, 4]]) # 2 dimension

In [5]: c

Out[5]:

array([[1, 2],
       [3, 4]])

(1-2-1) Product of array elements: np.prod()

For a 2-dimensional array, axis=0 multiplies the elements of each column together (top to bottom), and axis=1 multiplies the elements of each row together (left to right).

# (1-2-1) np.prod() : Return the product of array elements over a given axis

# 1 dimensional array

In [3]: b

Out[3]: array([1, 2, 3, 4])

In [6]: np.prod(b) # 1*2*3*4

Out[6]: 24

# 2 dimensional array

In [5]: c

Out[5]:

array([[1, 2],
       [3, 4]])

In [7]: np.prod(c, axis=0) # [1*3, 2*4] ↓

Out[7]: array([3, 8])

In [8]: np.prod(c, axis=1) # [1*2, 3*4] →

Out[8]: array([ 2, 12])
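
The axis behavior above can be checked in one place with a small script. This is only a sketch; the variable names are mine, and np.multiply.reduce is included to show the ufunc method that gives the same result as np.prod for this input.

```python
import numpy as np

c = np.array([[1, 2], [3, 4]])

# axis=None (the default) multiplies every element of the array together.
total = np.prod(c)               # 1*2*3*4

# axis=0 collapses the rows, leaving one product per column.
by_column = np.prod(c, axis=0)   # [1*3, 2*4]

# axis=1 collapses the columns, leaving one product per row.
by_row = np.prod(c, axis=1)      # [1*2, 3*4]

# np.multiply is a binary ufunc; its reduce method multiplies along an axis too.
by_column_ufunc = np.multiply.reduce(c, axis=0)

print(total, by_column, by_row, by_column_ufunc)
```

A handy way to remember the axis argument: the axis you name is the one that disappears from the result.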

(1-2-2) Sum of array elements: np.sum()

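With keepdims=True, the reduced axis is kept in the result with size one, so summing the 1-dimensional array b returns a 1-dimensional array with a single element instead of a scalar.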

# (1-2-2) np.sum() : Sum of array elements over a given axis

# 1 dimensional array

In [3]: b

Out[3]: array([1, 2, 3, 4])

In [9]: np.sum(b) # [1+2+3+4]

Out[9]: 10

# the axes which are reduced are left in the result as dimensions with size one

In [10]: np.sum(b, keepdims=True)

Out[10]: array([10]) # 1 dimension array

In [11]: np.sum(b, keepdims=True).shape # 1 dimension array

Out[11]: (1,)

For a 2-dimensional array, axis=0 adds the elements of each column (top to bottom) and axis=1 adds the elements of each row (left to right); either way the result is a 1-dimensional array.

# 2 dimensional array

In [5]: c

Out[5]:

array([[1, 2],
       [3, 4]])

In [12]: np.sum(c, axis=0) # [1+3, 2+4] ↓

Out[12]: array([4, 6])

In [13]: np.sum(c, axis=1) # [1+2, 3+4] →

Out[13]: array([3, 7])
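
keepdims pays off mostly with 2-dimensional arrays, because the size-one axis lets the result broadcast against the original array. The sketch below is my own illustration (the names row_sums_kept and row_share are not from the original session): it converts each row of c into proportions of its row total.

```python
import numpy as np

c = np.array([[1, 2], [3, 4]])

row_sums = np.sum(c, axis=1)                      # shape (2,)
row_sums_kept = np.sum(c, axis=1, keepdims=True)  # shape (2, 1)

# The kept axis of size one broadcasts across the columns of c,
# so each element is divided by the total of its own row.
row_share = c / row_sums_kept

print(row_sums.shape, row_sums_kept.shape)
print(row_share)
```

Without keepdims=True, c / row_sums would broadcast the sums across columns instead of rows, which silently gives a different result for a square array like this one.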

(1-2-3) Product of array elements that include NaN: np.nanprod()

np.nanprod() treats NaN (Not a Number) as 1 (one) when multiplying the array elements.

# (1-2-3) np.nanprod() : Return the product of array elements

# over a given axis treating Not a Numbers (NaNs) as ones

In [14]: d = np.array([[1, 2], [3, np.nan]])

In [15]: d

Out[15]:

array([[ 1., 2.],
       [ 3., nan]])

In [16]: np.nanprod(d, axis=0) # [1*3, 2*1] ↓

Out[16]: array([ 3., 2.])

In [17]: np.nanprod(d, axis=1) # [1*2, 3*1] →

Out[17]: array([ 2., 3.])
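
To see why np.nanprod() is needed, it helps to compare it with plain np.prod() on the same array. The following sketch also writes out by hand what "treating NaN as 1" amounts to; the np.where replacement is my own equivalent formulation, not how NumPy necessarily implements it.

```python
import numpy as np

d = np.array([[1, 2], [3, np.nan]])

# Plain np.prod propagates NaN: any column containing NaN becomes NaN.
plain = np.prod(d, axis=0)

# np.nanprod treats each NaN as 1, so it drops out of the product.
nan_safe = np.nanprod(d, axis=0)

# Hand-written equivalent: replace NaN with 1 before multiplying.
manual = np.prod(np.where(np.isnan(d), 1.0, d), axis=0)

print(plain, nan_safe, manual)
```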

(1-2-4) Sum of array elements that include NaN: np.nansum()

np.nansum() treats NaN (Not a Number) as 0 (zero) when adding the array elements.

In [15]: d

Out[15]:

array([[ 1., 2.],
       [ 3., nan]])

# (1-2-4) np.nansum() : Return the sum of array elements

# over a given axis treating Not a Numbers (NaNs) as zero

In [18]: np.nansum(d, axis=0) # [1+3, 2+0] ↓

Out[18]: array([ 4., 2.])

In [19]: np.nansum(d, axis=1) # [1+2, 3+0] →

Out[19]: array([ 3., 3.])
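
The same comparison works for sums. This sketch contrasts np.sum() with np.nansum() and adds one edge case worth knowing: in current NumPy versions, a slice made up entirely of NaN sums to 0.0. The np.where line is again my own hand-written equivalent.

```python
import numpy as np

d = np.array([[1, 2], [3, np.nan]])

# Plain np.sum propagates NaN: the second row's sum is NaN.
plain = np.sum(d, axis=1)

# np.nansum treats each NaN as 0, so it drops out of the sum.
nan_safe = np.nansum(d, axis=1)

# Hand-written equivalent: replace NaN with 0 before adding.
manual = np.sum(np.where(np.isnan(d), 0.0, d), axis=1)

# Edge case: nothing but NaN still gives a number, namely 0.0.
all_nan = np.nansum(np.array([np.nan, np.nan]))

print(plain, nan_safe, manual, all_nan)
```

Because an all-NaN slice returns 0.0, a zero from np.nansum() does not by itself tell you whether any valid values were present; np.isnan(d).all(axis=1) can answer that separately.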

(1-2-5) Cumulative product of array elements: np.cumprod()

With axis=0 the elements of each column are multiplied cumulatively from top to bottom, and with axis=1 the elements of each row are multiplied cumulatively from left to right.

In [20]: e = np.array([1, 2, 3, 4])

In [21]: e

Out[21]: array([1, 2, 3, 4])

In [22]: f = np.array([[1, 2, 3], [4, 5, 6]])

In [23]: f

Out[23]:

array([[1, 2, 3],
       [4, 5, 6]])

# (1-2-5) np.cumprod() : Return the cumulative product of elements along a given axis

In [24]: np.cumprod(e) # [1, 1*2, 1*2*3, 1*2*3*4]

Out[24]: array([ 1, 2, 6, 24], dtype=int32)

In [25]: np.cumprod(f, axis=0) # [[1, 2, 3], [1*4, 2*5, 3*6]] ↓

Out[25]:

array([[ 1, 2, 3],
       [ 4, 10, 18]], dtype=int32)

In [26]: np.cumprod(f, axis=1) # [[1, 1*2, 1*2*3], [4, 4*5, 4*5*6]] →

Out[26]:

array([[ 1, 2, 6],
       [ 4, 20, 120]], dtype=int32)
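
Two properties of np.cumprod() are easy to verify with a short sketch: the last cumulative value along an axis equals the plain product along that axis, and leaving out axis on a 2-dimensional array flattens it first. The variable names here are my own.

```python
import numpy as np

e = np.array([1, 2, 3, 4])
f = np.array([[1, 2, 3], [4, 5, 6]])

# Running product of a 1-D array; the final entry is the full product.
running = np.cumprod(e)

# Without axis, the 2-D array is flattened to [1, 2, 3, 4, 5, 6] first.
flat = np.cumprod(f)

# Along axis=1, the last column holds the product of each row.
along_rows = np.cumprod(f, axis=1)

print(running, flat, along_rows[:, -1])
```

The dtype=int32 shown in the outputs above reflects the environment the session was run in; on other platforms the integer dtype may differ, so I would not rely on it.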

(1-2-6) Cumulative sum of array elements: np.cumsum()

With axis=0 the elements of each column are added cumulatively from top to bottom, and with axis=1 the elements of each row are added cumulatively from left to right.

In [21]: e

Out[21]: array([1, 2, 3, 4])

# (1-2-6) np.cumsum(a, axis) : Return the cumulative sum of the elements along a given axis

In [27]: np.cumsum(e) # [1, 1+2, 1+2+3, 1+2+3+4]

Out[27]: array([ 1, 3, 6, 10], dtype=int32)

In [23]: f

Out[23]:

array([[1, 2, 3],
       [4, 5, 6]])

In [28]: np.cumsum(f, axis=0) # [[1, 2, 3], [1+4, 2+5, 3+6]] ↓

Out[28]:

array([[1, 2, 3],
       [5, 7, 9]], dtype=int32)

In [29]: np.cumsum(f, axis=1) # [[1, 1+2, 1+2+3], [4, 4+5, 4+5+6]] →

Out[29]:

array([[ 1, 3, 6],
       [ 4, 9, 15]], dtype=int32)
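
As with the cumulative product, the end of each running total matches the plain sum along the same axis. The sketch below checks that, and also shows np.add.accumulate, the ufunc method that produces the same running totals for this input.

```python
import numpy as np

f = np.array([[1, 2, 3], [4, 5, 6]])

# Running totals down each column; the last row holds the column sums.
down = np.cumsum(f, axis=0)

# Running totals along each row; the last column holds the row sums.
across = np.cumsum(f, axis=1)

# np.add is a binary ufunc; accumulate applies it cumulatively along an axis.
across_ufunc = np.add.accumulate(f, axis=1)

print(down[-1], across[:, -1])
```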

(1-2-7) n-th order difference of array elements: np.diff()

# (1-2-7) diff(a, n, axis) : Calculate the n-th discrete difference along given axis

In [30]: g = np.array([1, 2, 4, 10, 13, 20])

In [31]: g

Out[31]: array([ 1, 2, 4, 10, 13, 20])

# 1st order difference

In [32]: np.diff(g) # [2-1, 4-2, 10-4, 13-10, 20-13]

Out[32]: array([1, 2, 6, 3, 7])

# 2nd order difference => difference the 1st order result Out[32] once more

In [33]: np.diff(g, n=2) # [2-1, 6-2, 3-6, 7-3] <- using Out[32] array (1st order difference)

Out[33]: array([ 1, 4, -3, 4])

# 3rd order difference => difference the 2nd order result Out[33] once more

In [34]: np.diff(g, n=3) # [4-1, -3-4, 4-(-3)] <- using Out[33] array (2nd order difference)

Out[34]: array([ 3, -7, 7])
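
The relationship between the orders can be written out directly. This sketch confirms that a first difference is just a shifted subtraction, that n=2 is the same as calling np.diff() twice, and that a cumulative sum undoes a first difference once the starting value is put back. The names are mine.

```python
import numpy as np

g = np.array([1, 2, 4, 10, 13, 20])

first = np.diff(g)
second = np.diff(g, n=2)

# A first difference is each element minus the one before it.
manual_first = g[1:] - g[:-1]

# n=2 is the first difference applied twice.
nested = np.diff(np.diff(g))

# Each order of differencing shortens the array by one element.
lengths = (len(g), len(first), len(second))

# np.cumsum reverses np.diff if the first value of g is restored.
restored = np.concatenate(([g[0]], g[0] + np.cumsum(first)))

print(first, second, lengths, restored)
```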

For a 2-dimensional array, axis=0 takes differences down each column (each element minus the one above it), and axis=1 takes differences along each row (each element minus the one to its left).

#---- 2 dimensional arrays

In [35]: h = np.array([[1, 2, 4, 8], [10, 13, 20, 15]])

In [36]: h

Out[36]:

array([[ 1, 2, 4, 8],
       [10, 13, 20, 15]])

In [37]: np.diff(h, axis=0) # [10-1, 13-2, 20-4, 15-8] ↑

Out[37]: array([[ 9, 11, 16, 7]])

In [38]: np.diff(h, axis=1) # [[2-1, 4-2, 8-4], [13-10, 20-13, 15-20]] ←

Out[38]:

array([[ 1, 2, 4],
       [ 3, 7, -5]])

# With n=2, the 1st order result Out[38] is differenced once more

In [39]: np.diff(h, n=2, axis=1) # [[2-1, 4-2], [7-3, -5-7]] ←

Out[39]:

array([[ 1, 2],
       [ 4, -12]])
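
For 2-dimensional input it can help to see np.diff() as slicing. The sketch below, with names of my own choosing, shows that the differenced axis loses one element per order while the other axis keeps its length.

```python
import numpy as np

h = np.array([[1, 2, 4, 8], [10, 13, 20, 15]])

# axis=0: every row minus the row above it, so 2 rows become 1.
rows = np.diff(h, axis=0)
rows_manual = h[1:, :] - h[:-1, :]

# axis=1: every column minus the column to its left, so 4 columns become 3.
cols = np.diff(h, axis=1)
cols_manual = h[:, 1:] - h[:, :-1]

print(rows.shape, cols.shape)
```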

(1-2-8) Differences returned as a 1-dimensional array: np.ediff1d()

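Differencing a 2-dimensional array with np.diff(h, axis=1) returns a 2-dimensional array, as in Out[38]. np.ediff1d(h), by contrast, flattens the input first and returns the differences as a 1-dimensional array, as in Out[41].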

# (1-2-8) ediff1d(ary[, to_end, to_begin])

# : The differences between consecutive elements of an array

In [31]: g

Out[31]: array([ 1, 2, 4, 10, 13, 20])

In [40]: np.ediff1d(g)

Out[40]: array([1, 2, 6, 3, 7])

# 2 dimensional array

In [36]: h

Out[36]:

array([[ 1, 2, 4, 8],
       [10, 13, 20, 15]])

# The returned array is always 1D

In [41]: np.ediff1d(h)

Out[41]: array([ 1, 2, 4, 2, 3, 7, -5]) # 1D array, not 2D array

np.ediff1d() can also prepend and append values to the result through the to_begin and to_end arguments.

In [42]: np.ediff1d(h, to_begin=np.array([-100, -99]), to_end=np.array([99, 100]))

Out[42]: array([-100, -99, 1, 2, 4, 2, 3, 7, -5, 99, 100])
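
Since np.ediff1d() flattens its input, the result for h matches np.diff() applied to the flattened array, including one difference that crosses the row boundary (10 - 8 = 2). The sketch below checks this; using to_begin=0 to keep the output the same length as the input is my own example of how the padding arguments might be used.

```python
import numpy as np

h = np.array([[1, 2, 4, 8], [10, 13, 20, 15]])

# ediff1d works on the flattened array [1, 2, 4, 8, 10, 13, 20, 15].
flat_diff = np.ediff1d(h)
same_as_diff = np.diff(h.ravel())

# Prepending one value makes the result as long as the flattened input.
padded = np.ediff1d(h, to_begin=0)

print(flat_diff)
print(padded, padded.size == h.size)
```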

(1-2-9) Gradient: np.gradient()

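The gradient is an array of first-order partial derivative estimates. np.gradient() uses central differences at interior points and one-sided differences at the two boundaries. The comments in the examples below spell out the arithmetic, since it is not easy to describe in words. ^^;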

In [31]: g

Out[31]: array([ 1, 2, 4, 10, 13, 20])

# [(2-1), {(2-1)+(4-2)}/2, {(4-2)+(10-4)}/2, {(10-4)+(13-10)}/2, {(13-10)+(20-13)}/2, (20-13)]

In [43]: np.gradient(g)

Out[43]: array([ 1. , 1.5, 4. , 4.5, 5. , 7. ])

# N scalars specifying the sample distances for each dimension

# One step along the x axis is 2 units, and an interior point looks at the change in y across both neighbors, so divide by 2 (spacing) * 2 (directions)

# [(2-1)/2, {(2-1)+(4-2)}/(2*2), {(4-2)+(10-4)}/(2*2), {(10-4)+(13-10)}/(2*2), {(13-10)+(20-13)}/(2*2), (20-13)/2]

In [44]: np.gradient(g, 2)

Out[44]: array([ 0.5 , 0.75, 2. , 2.25, 2.5 , 3.5 ])

# Gradient is calculated using N-th order accurate differences at the boundaries

# 2nd order accurate differences at the two boundaries only: 1 - (1.5 - 1) = 0.5, 7 + (7 - 5) = 9

In [45]: np.gradient(g, edge_order=2)

Out[45]: array([ 0.5, 1.5, 4. , 4.5, 5. , 9. ])
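
The three calls above can be reproduced by hand for evenly spaced 1-dimensional data. The function gradient_1d below is a hypothetical helper I wrote for illustration, not part of NumPy; it uses central differences in the interior and, for edge_order=2, the standard second-order one-sided formulas at the two ends.

```python
import numpy as np

def gradient_1d(y, spacing=1.0, edge_order=1):
    """Hypothetical helper: gradient of evenly spaced 1-D data (needs len(y) >= 3)."""
    y = np.asarray(y, dtype=float)
    out = np.empty_like(y)
    # Interior points: central difference over both neighbors.
    out[1:-1] = (y[2:] - y[:-2]) / (2.0 * spacing)
    if edge_order == 1:
        # Boundaries: first-order one-sided differences.
        out[0] = (y[1] - y[0]) / spacing
        out[-1] = (y[-1] - y[-2]) / spacing
    else:
        # Boundaries: second-order accurate one-sided differences.
        out[0] = (-3.0 * y[0] + 4.0 * y[1] - y[2]) / (2.0 * spacing)
        out[-1] = (3.0 * y[-1] - 4.0 * y[-2] + y[-3]) / (2.0 * spacing)
    return out

g = np.array([1, 2, 4, 10, 13, 20])

default = gradient_1d(g)
spaced = gradient_1d(g, spacing=2.0)
second_order_edges = gradient_1d(g, edge_order=2)

print(default, spaced, second_order_edges)
```

For this g, the second-order edge values work out to (-3*1 + 4*2 - 4)/2 = 0.5 and (3*20 - 4*13 + 10)/2 = 9, which agrees with the shortcut arithmetic in the comment above Out[45].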

Below is the gradient of a 2-dimensional array. np.gradient(h) returns the results of np.gradient(h, axis=0) and np.gradient(h, axis=1) together, one array per axis. Each array is computed the same way as in the 1-dimensional case above.

# 2 dimensional array

In [36]: h

Out[36]:

array([[ 1, 2, 4, 8],
       [10, 13, 20, 15]])

# the first array stands for the gradient in rows and the second one in columns direction

In [46]: np.gradient(h)

Out[46]:

[array([[ 9., 11., 16., 7.],
        [ 9., 11., 16., 7.]]),
 array([[ 1. , 1.5, 3. , 4. ],
        [ 3. , 5. , 1. , -5. ]])]

# The axis keyword can be used to specify a subset of axes of which the gradient is calculated

In [47]: np.gradient(h, axis=0) # ↑

Out[47]:

array([[ 9., 11., 16., 7.],
       [ 9., 11., 16., 7.]])

In [48]: np.gradient(h, axis=1) # ←

Out[48]:

array([[ 1. , 1.5, 3. , 4. ],
       [ 3. , 5. , 1. , -5. ]])
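
In practice the per-axis results are usually unpacked straight into separate names. The sketch below does that and checks each piece against the axis-specific calls. It also passes one spacing per axis, which is my own extension of the scalar-spacing example from the 1-dimensional case.

```python
import numpy as np

h = np.array([[1, 2, 4, 8], [10, 13, 20, 15]])

# One array per axis: first along rows (axis=0), then along columns (axis=1).
d_rows, d_cols = np.gradient(h)

# Each piece matches the call restricted to that axis.
only_rows = np.gradient(h, axis=0)
only_cols = np.gradient(h, axis=1)

# One spacing per axis: rows are 2 units apart, columns 1 unit apart.
scaled_rows, scaled_cols = np.gradient(h, 2.0, 1.0)

print(d_rows)
print(d_cols)
```

Unpacking works whether the return value is a list or a tuple, so the code does not depend on which container a given NumPy version returns.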

The next post will cover exponential, logarithmic, and trigonometric functions.

I hope this was helpful.


License: Attribution, NonCommercial, NoDerivatives


Posted by Rfriend
