[Python] Saving NumPy arrays to external files (save) and loading external files as arrays (load)

Python Analysis and Programming / Python Data Preprocessing, 2018. 5. 21. 23:33

In this post, we look at how to save Python NumPy array data to an external file (save) and how to load an external file back into an array (load).

np.save() : Save a single array to a binary file in NumPy format (*.npy)
np.load() : Open a *.npy file written by np.save() and load it as an array

np.savez() : Save several arrays into a single file in uncompressed .npz format
np.load() : Open a *.npz file written by np.savez() and load its arrays

np.savez_compressed() : Save several arrays into a single file in compressed .npz format
np.load() : Open a compressed *.npz file written by np.savez_compressed() and load its arrays

np.savetxt() : Save a 1D or 2D array (or a sequence of equal-length 1D arrays) to a file as plain text
np.loadtxt() : Open a text file and load it as an array

[ Saving Python NumPy arrays to files (save) and loading them (load) ]

Let's go through them one at a time with simple examples.

> np.save() : Save a single array to a binary file in NumPy format

> np.load() : Open a *.npy file written by np.save() and load it as an array

In [1]: import numpy as np

In [2]: x = np.array([0, 1, 2, 3, 4])

# Save the array

In [3]: np.save('D:/admin/Documents/x_save', x) # x_save.npy

[ File saved in .npy format ]

# Load it back as an array

In [4]: x_save_load = np.load('D:/admin/Documents/x_save.npy')

In [5]: x_save_load

Out[5]: array([0, 1, 2, 3, 4])
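
The round trip above can be checked programmatically. The following is a minimal sketch, assuming a temporary directory in place of the D:/admin/Documents folder used in this post; the variable names `saved_files` and `x_loaded` are made up for illustration.

```python
import os
import tempfile

import numpy as np

x = np.array([0, 1, 2, 3, 4])

# Hypothetical temporary directory, used here instead of D:/admin/Documents
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "x_save")
    np.save(path, x)                    # NumPy appends ".npy" to the file name
    saved_files = os.listdir(tmp_dir)   # names of the files actually written
    x_loaded = np.load(path + ".npy")   # read the array back into memory

print(saved_files)                      # ['x_save.npy']
print(np.array_equal(x, x_loaded))      # True: same values
print(x_loaded.dtype == x.dtype)        # True: the dtype is preserved
```

Because the .npy format stores the dtype and shape along with the data, the loaded array should match the original in both values and type.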

> np.savez() : Save several arrays into a single file in uncompressed .npz format

> np.load() : Open a *.npz file written by np.savez() and load its arrays

In [6]: x = np.array([0, 1, 2, 3, 4])

In [7]: y = np.array([5, 6, 7, 8, 9])

In [8]: np.savez('D:/admin/Documents/xy_savez'

...: , x=x, y=y) # give each array a name

[ File saved in .npz format ]

You can open a .npz file with np.load(). The returned object is of type 'numpy.lib.npyio.NpzFile', not an array, and you access the individual arrays by name using [ ].

# Load the arrays

In [9]: xy_savez_load = np.load('D:/admin/Documents/xy_savez.npz')

In [10]: type(xy_savez_load)

Out[10]: numpy.lib.npyio.NpzFile

In [11]: xy_savez_load['x']

Out[11]: array([0, 1, 2, 3, 4])

In [12]: xy_savez_load['y']

Out[12]: array([5, 6, 7, 8, 9])
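
If you do not remember which names were used when saving, the `files` attribute of the loaded object lists them. The sketch below is illustrative only and assumes a temporary directory and made-up file names; it also shows that arrays passed to np.savez() without keywords are stored under the automatic names arr_0, arr_1, and so on.

```python
import os
import tempfile

import numpy as np

x = np.array([0, 1, 2, 3, 4])
y = np.array([5, 6, 7, 8, 9])

# Hypothetical temporary directory and file names, for illustration only
with tempfile.TemporaryDirectory() as tmp_dir:
    named_path = os.path.join(tmp_dir, "xy_named.npz")
    positional_path = os.path.join(tmp_dir, "xy_positional.npz")

    np.savez(named_path, x=x, y=y)      # arrays stored under the names x and y
    np.savez(positional_path, x, y)     # no names given: arr_0, arr_1 are used

    named = np.load(named_path)
    named_keys = sorted(named.files)    # names stored in the archive
    named.close()

    positional = np.load(positional_path)
    positional_keys = sorted(positional.files)
    first = positional["arr_0"]         # the first positional array, i.e. x
    positional.close()

print(named_keys)        # ['x', 'y']
print(positional_keys)   # ['arr_0', 'arr_1']
print(first)             # [0 1 2 3 4]
```

Giving each array an explicit name, as in the example above, tends to make the file easier to read back later.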

When you no longer need a file opened with np.load(), close it with .close() so that the file handle and its resources are released. If you try to index the object after it has been closed, you get an error such as 'NoneType' object has no attribute 'open'.

In [13]: xy_savez_load.close()

In [14]: xy_savez_load['x'] # AttributeError: 'NoneType' object has no attribute 'open'

Traceback (most recent call last):

File "<ipython-input-14-14d248a305d2>", line 1, in <module>

xy_savez_load['x'] # AttributeError: 'NoneType' object has no attribute 'open'

File "C:\Users\admin\Anaconda3\envs\py_v36\lib\site-packages\numpy\lib\npyio.py", line 226, in __getitem__

bytes = self.zip.open(key)

AttributeError: 'NoneType' object has no attribute 'open'
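
One way to avoid forgetting .close() is to open the .npz file in a `with` block, since the object returned by np.load() for .npz files can be used as a context manager. This is a minimal sketch under the assumption of a temporary directory and a made-up file name; arrays read inside the block are ordinary in-memory arrays and stay usable after the file is closed.

```python
import os
import tempfile

import numpy as np

x = np.array([0, 1, 2, 3, 4])
y = np.array([5, 6, 7, 8, 9])

# Hypothetical temporary directory, used here instead of D:/admin/Documents
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "xy_savez.npz")
    np.savez(path, x=x, y=y)

    # The file is closed automatically when the with block ends
    with np.load(path) as data:
        x_loaded = data["x"]    # read each array while the file is open
        y_loaded = data["y"]

# The arrays were copied into memory, so they still work after closing
print(x_loaded + y_loaded)      # [ 5  7  9 11 13]
```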

> np.savez_compressed() : Save several arrays into a single file in compressed .npz format

> np.load() : Open a compressed *.npz file written by np.savez_compressed() and load its arrays

In [15]: x = np.array([0, 1, 2, 3, 4])

In [16]: y = np.array([5, 6, 7, 8, 9])

In [17]: np.savez_compressed('D:/admin/Documents/xy_savez_compress'

...: , x=x, y=y)

[ File saved in compressed .npz format ]

Loading with np.load() again returns an object of type 'numpy.lib.npyio.NpzFile', and you can index the arrays by name with [ ]. When you are done, close it with .close().

In [18]: xy_savez_compress_load = np.load('D:/admin/Documents/xy_savez_compress.npz')

In [19]: type(xy_savez_compress_load)

Out[19]: numpy.lib.npyio.NpzFile

In [20]: xy_savez_compress_load['x']

Out[20]: array([0, 1, 2, 3, 4])

In [21]: xy_savez_compress_load['y']

Out[21]: array([5, 6, 7, 8, 9])

In [22]: xy_savez_compress_load.close()
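
To see what compression buys you, you can compare the sizes of the files written by np.savez() and np.savez_compressed(). The sketch below is only an illustration: it assumes a temporary directory and uses large all-zero arrays, which compress unusually well, so real data will usually show a smaller difference.

```python
import os
import tempfile

import numpy as np

# Hypothetical, highly compressible data (all zeros), chosen for illustration
x = np.zeros(100_000)
y = np.zeros(100_000)

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "xy_savez.npz")
    compressed_path = os.path.join(tmp_dir, "xy_savez_compress.npz")

    np.savez(plain_path, x=x, y=y)                   # stored without compression
    np.savez_compressed(compressed_path, x=x, y=y)   # stored with compression

    plain_size = os.path.getsize(plain_path)
    compressed_size = os.path.getsize(compressed_path)

    # Both files are read the same way with np.load()
    with np.load(compressed_path) as data:
        x_loaded = data["x"]

print(plain_size, compressed_size)    # the compressed file is much smaller here
print(np.array_equal(x, x_loaded))    # True: compression is lossless
```

The trade-off is that compressing and decompressing takes extra time, so the uncompressed format can be the better choice when speed matters more than disk space.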

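> np.savetxt() : Save an array to a file as plain text

> np.loadtxt() : Open a text file and load it as an array

With header and footer you can add descriptive text at the beginning and end of the file; each is written as a comment line starting with '#'.

With fmt you can specify the number format. In the example below, values are written in fixed notation with two digits after the decimal point.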

In [23]: x = np.array([0, 1, 2, 3, 4])

In [24]: y = np.array([5, 6, 7, 8, 9])

In [25]: np.savetxt('D:/admin/Documents/xy_savetxt.txt'

...: , (x, y) # x,y equal sized 1D arrays

...: , header='--xy save start--'

...: , footer='--xy save end--'

...: , fmt='%1.2f') # the second digit after the decimal point

[ Array saved as a text file ]

You can load the text file back into an array with np.loadtxt(), which returns an ndarray directly.

In [26]: xy_savetxt_load = np.loadtxt('D:/admin/Documents/xy_savetxt.txt')

In [27]: xy_savetxt_load

Out[27]:

array([[ 0., 1., 2., 3., 4.],

[ 5., 6., 7., 8., 9.]])

In [28]: type(xy_savetxt_load)

Out[28]: numpy.ndarray
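
It can help to look at the text that np.savetxt() actually writes. The following sketch assumes a temporary directory instead of D:/admin/Documents and reads the file back as plain lines; the header and footer appear as lines prefixed with '# ', which np.loadtxt() skips as comments by default.

```python
import os
import tempfile

import numpy as np

x = np.array([0, 1, 2, 3, 4])
y = np.array([5, 6, 7, 8, 9])

# Hypothetical temporary directory, used here instead of D:/admin/Documents
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "xy_savetxt.txt")
    np.savetxt(path, (x, y),
               header='--xy save start--',
               footer='--xy save end--',
               fmt='%1.2f')               # two digits after the decimal point

    # Read the raw text to inspect what was written
    with open(path) as f:
        lines = f.read().splitlines()

    xy_loaded = np.loadtxt(path)          # comment lines are ignored on load

for line in lines:
    print(line)
# # --xy save start--
# 0.00 1.00 2.00 3.00 4.00
# 5.00 6.00 7.00 8.00 9.00
# # --xy save end--

print(xy_loaded.shape)                    # (2, 5): one row per input array
```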

A 2D array can also be saved to a text file.

In [29]: x2 = np.arange(12).reshape(3, 4)

In [30]: x2

Out[30]:

array([[ 0, 1, 2, 3],

[ 4, 5, 6, 7],

[ 8, 9, 10, 11]])

In [31]: np.savetxt('D:/admin/Documents/x2_savetxt.txt'

...: , x2

...: , fmt='%1.2f')

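[ 2D array saved as a text file ]

You can load the text file back into an array with np.loadtxt(). The result has the same shape and values as the original x2 array, although the values come back as floats (0., 1., ...) rather than integers, because np.loadtxt() reads data as float by default.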

In [32]: x2_savetxt_load = np.loadtxt('D:/admin/Documents/x2_savetxt.txt')

In [33]: x2_savetxt_load

Out[33]:

array([[ 0., 1., 2., 3.],

[ 4., 5., 6., 7.],

[ 8., 9., 10., 11.]])
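
If you want integers back rather than floats, one option is to write the values with an integer format and pass dtype=int when loading. This is a minimal sketch, assuming a temporary directory and a made-up file name; the names `loaded_default` and `loaded_int` are only for illustration.

```python
import os
import tempfile

import numpy as np

x2 = np.arange(12).reshape(3, 4)

# Hypothetical temporary directory, used here instead of D:/admin/Documents
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "x2_savetxt_int.txt")
    np.savetxt(path, x2, fmt='%d')              # write values as integers

    loaded_default = np.loadtxt(path)           # default dtype is float
    loaded_int = np.loadtxt(path, dtype=int)    # ask for integers explicitly

print(loaded_default.dtype)                     # float64
print(np.array_equal(x2, loaded_int))           # True: same shape and values
print(np.issubdtype(loaded_int.dtype, np.integer))   # True
```

Unlike .npy and .npz files, a text file does not record the dtype, so the reader has to state it when the float default is not what is wanted.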

I hope this was helpful.


License: Attribution, NonCommercial, NoDerivatives


Posted by Rfriend

A friend for analysis and programming with R and Python (by R Friend)
