이번 포스팅에서는 PostgreSQL, Greenplum database에서 지리공간 데이터 분석 (Geo-Spatial data analysis) 을 할 수 있도록 해주는 외장 확장 오픈 소스 소프트웨어 프로그램인 PostGIS 를 Greenplum docker 위에 설치하는 방법을 소개하겠습니다.
Greenplum DB에 PostGIS 를 설치하는 가이드는 https://gpdb.docs.pivotal.io/5100/ref_guide/extensions/postGIS.html 를 참고하였습니다.
0. (사전 준비 사항) Docker를 이용하여 Greenplum DB + MADlib + PL/x 설치 |
CentOS + Greenplum + MADlib + PL/R + PL/Python 이 설치된 Docker Image를 이용하여 분석환경을 구성하는 자세한 내용은 https://rfriend.tistory.com/379 포스팅을 참고하기 바랍니다.
명령 프롬프트 창을 띄우고 아래 docker 명령어로 greenplum을 간편하게 설치해보세요.
--------------------------------- -- GPDB w/MADlib, PL/x on Docker : https://hub.docker.com/r/hdlee2u/gpdb-analytics --------------------------------- -- (1) Docker Image Pull $ docker pull hdlee2u/gpdb-analytics $ docker images -- (2) Docker Image Run(port 5432) -> Docker Container Creation $ docker run -i -d -p 5432:5432 -p 28080:28080 --name gpdb-ds --hostname mdw hdlee2u/gpdb-analytics /usr/sbin/sshd -D $ docker ps -a -- (3) To Start Greenplum Database and Use psql $ docker exec -it gpdb-ds /bin/bash [root@mdw /]# su - gpadmin [gpadmin@mdw ~]$ gpstart -a .... GPDB start ....
|
CnetOS와 GPDB 버전에 맞는 PostGIS 버전을 다운로드해서 설치를 해야 합니다. IP 확인, CentOS version 확인, MADlib, PL/R 버전 확인, R & Python Data Science Package version 확인하는 방법은 아래를 참고하세요.
- CentOS : release 7.4
- Greenplum Database : ver 5.10.2
- MADlib : ver 1.15
- PL/R : 2.3.2
- DataScienceR : 1.0.1
- DataSciencePython : 1.1.1
------------------------------------- -- IP check [gpadmin@mdw ~]$ [root@mdw ~]# cd /home/gpadmin [root@mdw gpadmin]# ifconfig -a eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 172.17.0.2 netmask 255.255.0.0 broadcast 172.17.255.255 ether 02:42:ac:11:00:02 txqueuelen 0 (Ethernet) RX packets 25395 bytes 10372326 (9.8 MiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 25074 bytes 79368842 (75.6 MiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 ... (이하 생략) -------------------------------------- -- MADlib, PL/R, Python Data Science Package, GP Command Center version check -------------------------------------- [root@mdw gpadmin]# cd /setup [root@mdw setup]# ls -al total 810088 drwxr-xr-x 4 gpadmin gpadmin 4096 Aug 13 2018 . drwxr-xr-x 1 root root 4096 Mar 11 05:08 .. -rw-r--r-- 1 gpadmin gpadmin 218258940 Aug 13 2018 DataSciencePython-1.1.1-gp5-rhel7-x86_64.gppkg -rw-r--r-- 1 gpadmin gpadmin 146189713 Aug 13 2018 DataScienceR-1.0.1-gp5-rhel7-x86_64.gppkg drwxr-xr-x 2 gpadmin gpadmin 4096 Jul 23 2018 greenplum-cc-web-4.3.0-LINUX-x86_64 -rw-r--r-- 1 gpadmin gpadmin 29040039 Aug 13 2018 greenplum-cc-web-4.3.0-LINUX-x86_64.zip -rwxr-xr-x 1 gpadmin gpadmin 197905185 Aug 10 2018 greenplum-db-5.10.2-rhel7-x86_64.bin -rw-r--r-- 1 gpadmin gpadmin 195802895 Aug 13 2018 greenplum-db-5.10.2-rhel7-x86_64.zip -rw-r--r-- 1 gpadmin gpadmin 4 Aug 13 2018 hostfile drwxr-xr-x 2 gpadmin gpadmin 4096 Aug 11 2018 madlib-1.15-gp5-rhel7-x86_64 -rw-r--r-- 1 gpadmin gpadmin 3023537 Aug 13 2018 madlib-1.15-gp5-rhel7-x86_64.tar.gz -rw-r--r-- 1 gpadmin gpadmin 39279994 Aug 13 2018 plr-2.3.2-gp5-rhel7-x86_64.gppkg -------------------------------------- -- CentOS version check [gpadmin@mdw setup]$ cat /etc/os-release NAME="CentOS Linux" VERSION="7 (Core)" ID="centos" ID_LIKE="rhel fedora" VERSION_ID="7" PRETTY_NAME="CentOS Linux 7 (Core)" ANSI_COLOR="0;31" CPE_NAME="cpe:/o:centos:centos:7" HOME_URL="https://www.centos.org/" BUG_REPORT_URL="https://bugs.centos.org/" CENTOS_MANTISBT_PROJECT="CentOS-7" CENTOS_MANTISBT_PROJECT_VERSION="7" REDHAT_SUPPORT_PRODUCT="centos" REDHAT_SUPPORT_PRODUCT_VERSION="7" |
1. Pivotal Network에서 PostGIS 다운로드 |
(1) https://network.pivotal.io/ 접속 (다운로드를 위해서는 회원가입 필요)
> (2) 'Pivotal Greenplum Releases: 5.10.2' : https://network.pivotal.io/products/pivotal-gpdb#/releases/158026
> (3) 'Greenplum Adnvanced Analytics' : https://network.pivotal.io/products/pivotal-gpdb#/releases/158026/file_groups/1084
> (4) 'PostGIS 2.1.5+pivotal.1 for RHEL 7' file download
의 순서대로 경로를 찾아가서 PostGIS 2.1.5+pivotal.1 for RHEL 7 파일을 다운로드 합니다.
2. 다운로드한 PostGIS 압축파일을 Greenplum Docker 컨테이너 안으로 복사(copy)하기 |
다른 명령 프롬프트 창을 띄우고, 아래처럼 Downloads 폴더로 경로 변경 후에 docker cp 명령문으로 1번에서 다운로드한 PostGIS 2.1.5 압축 파일을 Greenplum 도커 컨테이너 안의 'gpdb-ds:/setup' 경로로 복사해주세요.
-- [At another terminal window] Copy PostGIS 2.1.5 to GPDB-DS Docker Container ihongdon-ui-MacBook-Pro:~ ihongdon$ pwd /Users/ihongdon ihongdon-ui-MacBook-Pro:~ ihongdon$ cd Downloads/ ihongdon-ui-MacBook-Pro:Downloads ihongdon$ ls -al -rw-r--r--@ 1 ihongdon staff 19839907 3 22 16:28 postgis-2.1.5+pivotal.1-gp5-rhel7-x86_64.gppkg ihongdon-ui-MacBook-Pro:Downloads ihongdon$ docker cp postgis-2.1.5+pivotal.1-gp5-rhel7-x86_64.gppkg gpdb-ds:/setup ihongdon-ui-MacBook-Pro:Downloads ihongdon$ |
3. gpadmin 계정에게 postgis-2.1.5 파일에 대한 권한 부여 (chown) |
(1) gpadmin 으로 들어와 있는 명령 프롬프트 창으로 와서 root 계정으로 로그인 후에 => (2) chown 명령어를 이용하여 gpadmin 에 PostGIS 파일에 대한 권한을 부여해줍니다.
-- 파일 소유자나 소유 그룹 변경 : chown [gpadmin@mdw setup]$ su - Password: Last login: Fri Mar 22 07:01:35 UTC 2019 on pts/0 [root@mdw ~]# cd /setup [root@mdw setup]# ls -al total 829464 drwxr-xr-x 1 gpadmin gpadmin 4096 Mar 22 07:33 . drwxr-xr-x 1 root root 4096 Mar 11 05:08 .. -rw-r--r-- 1 gpadmin gpadmin 218258940 Aug 13 2018 DataSciencePython-1.1.1-gp5-rhel7-x86_64.gppkg -rw-r--r-- 1 gpadmin gpadmin 146189713 Aug 13 2018 DataScienceR-1.0.1-gp5-rhel7-x86_64.gppkg drwxr-xr-x 2 gpadmin gpadmin 4096 Jul 23 2018 greenplum-cc-web-4.3.0-LINUX-x86_64 -rw-r--r-- 1 gpadmin gpadmin 29040039 Aug 13 2018 greenplum-cc-web-4.3.0-LINUX-x86_64.zip -rwxr-xr-x 1 gpadmin gpadmin 197905185 Aug 10 2018 greenplum-db-5.10.2-rhel7-x86_64.bin -rw-r--r-- 1 gpadmin gpadmin 195802895 Aug 13 2018 greenplum-db-5.10.2-rhel7-x86_64.zip -rw-r--r-- 1 gpadmin gpadmin 4 Aug 13 2018 hostfile drwxr-xr-x 2 gpadmin gpadmin 4096 Aug 11 2018 madlib-1.15-gp5-rhel7-x86_64 -rw-r--r-- 1 gpadmin gpadmin 3023537 Aug 13 2018 madlib-1.15-gp5-rhel7-x86_64.tar.gz -rw-r--r-- 1 gpadmin gpadmin 39279994 Aug 13 2018 plr-2.3.2-gp5-rhel7-x86_64.gppkg -rw-r--r-- 1 501 games 19839907 Mar 22 07:28 postgis-2.1.5+pivotal.1-gp5-rhel7-x86_64.gppkg [root@mdw setup]# chown gpadmin:gpadmin postgis-2.1.5+pivotal.1-gp5-rhel7-x86_64.gppkg |
4. gppkg로 각 Segment 노드에 PostGIS 설치하기 |
(1) 명령 프롬프트 창에서 root 계정에서 exit 후 => gpadmin 계정에서 gppkg -i 로 PostGIS 2.1.5를 설치합니다.
(2) 그러면 로그 메시지에 'gppkg:mdw:gpadmin-[INFO]:-Please run the following commands to enable the PostGIS package: $GPHOME/share/postgresql/contrib/postgis-2.1/postgis_manager.sh mydatabase install'라는 메시지가 나옵니다. 이 메시지를 추가로 실행시킵니다.
-- PostGIS 2.1.5 install [root@mdw setup]# exit logout [gpadmin@mdw setup]$ ls -al total 829464 drwxr-xr-x 1 gpadmin gpadmin 4096 Mar 22 07:33 . drwxr-xr-x 1 root root 4096 Mar 11 05:08 .. -rw-r--r-- 1 gpadmin gpadmin 218258940 Aug 13 2018 DataSciencePython-1.1.1-gp5-rhel7-x86_64.gppkg -rw-r--r-- 1 gpadmin gpadmin 146189713 Aug 13 2018 DataScienceR-1.0.1-gp5-rhel7-x86_64.gppkg drwxr-xr-x 2 gpadmin gpadmin 4096 Jul 23 2018 greenplum-cc-web-4.3.0-LINUX-x86_64 -rw-r--r-- 1 gpadmin gpadmin 29040039 Aug 13 2018 greenplum-cc-web-4.3.0-LINUX-x86_64.zip -rwxr-xr-x 1 gpadmin gpadmin 197905185 Aug 10 2018 greenplum-db-5.10.2-rhel7-x86_64.bin -rw-r--r-- 1 gpadmin gpadmin 195802895 Aug 13 2018 greenplum-db-5.10.2-rhel7-x86_64.zip -rw-r--r-- 1 gpadmin gpadmin 4 Aug 13 2018 hostfile drwxr-xr-x 2 gpadmin gpadmin 4096 Aug 11 2018 madlib-1.15-gp5-rhel7-x86_64 -rw-r--r-- 1 gpadmin gpadmin 3023537 Aug 13 2018 madlib-1.15-gp5-rhel7-x86_64.tar.gz -rw-r--r-- 1 gpadmin gpadmin 39279994 Aug 13 2018 plr-2.3.2-gp5-rhel7-x86_64.gppkg -rw-r--r-- 1 gpadmin gpadmin 19839907 Mar 22 07:28 postgis-2.1.5+pivotal.1-gp5-rhel7-x86_64.gppkg [gpadmin@mdw setup]$ gppkg -i postgis-2.1.5+pivotal.1-gp5-rhel7-x86_64.gppkg 20190322:07:36:54:011243 gppkg:mdw:gpadmin-[INFO]:-Starting gppkg with args: -i postgis-2.1.5+pivotal.1-gp5-rhel7-x86_64.gppkg 20190322:07:36:55:011243 gppkg:mdw:gpadmin-[INFO]:-Installing package postgis-2.1.5+pivotal.1-gp5-rhel7-x86_64.gppkg 20190322:07:36:55:011243 gppkg:mdw:gpadmin-[INFO]:-Validating rpm installation cmdStr='rpm --test -i /usr/local/greenplum-db-5.10.2/.tmp/libexpat-2.1.0-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/gdal-1.11.1-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/proj-4.8.0-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/postgis-2.1.5-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/json-c-0.12-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/geos-3.4.2-1.x86_64.rpm --dbpath /usr/local/greenplum-db-5.10.2/share/packages/database --prefix /usr/local/greenplum-db-5.10.2' 20190322:07:36:55:011243 gppkg:mdw:gpadmin-[INFO]:-Installing postgis-2.1.5+pivotal.1-gp5-rhel7-x86_64.gppkg locally 20190322:07:36:56:011243 gppkg:mdw:gpadmin-[INFO]:-Validating rpm installation cmdStr='rpm --test -i /usr/local/greenplum-db-5.10.2/.tmp/libexpat-2.1.0-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/gdal-1.11.1-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/proj-4.8.0-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/postgis-2.1.5-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/json-c-0.12-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/geos-3.4.2-1.x86_64.rpm --dbpath /usr/local/greenplum-db-5.10.2/share/packages/database --prefix /usr/local/greenplum-db-5.10.2' 20190322:07:36:56:011243 gppkg:mdw:gpadmin-[INFO]:-Installing rpms cmdStr='rpm -i /usr/local/greenplum-db-5.10.2/.tmp/libexpat-2.1.0-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/gdal-1.11.1-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/proj-4.8.0-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/postgis-2.1.5-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/json-c-0.12-1.x86_64.rpm /usr/local/greenplum-db-5.10.2/.tmp/geos-3.4.2-1.x86_64.rpm --dbpath /usr/local/greenplum-db-5.10.2/share/packages/database --prefix=/usr/local/greenplum-db-5.10.2' 20190322:07:37:01:011243 gppkg:mdw:gpadmin-[INFO]:-Completed local installation of postgis-2.1.5+pivotal.1-gp5-rhel7-x86_64.gppkg. 20190322:07:37:01:011243 gppkg:mdw:gpadmin-[INFO]:-Please run the following commands to enable the PostGIS package: $GPHOME/share/postgresql/contrib/postgis-2.1/postgis_manager.sh mydatabase install 20190322:07:37:01:011243 gppkg:mdw:gpadmin-[INFO]:-postgis-2.1.5+pivotal.1-gp5-rhel7-x86_64.gppkg successfully installed. [gpadmin@mdw setup]$ cd $GPHOME [gpadmin@mdw greenplum-db]$ cd share [gpadmin@mdw share]$ ls gdal greenplum packages postgresql proj [gpadmin@mdw share]$ cd postgresql/ [gpadmin@mdw postgresql]$ cd contrib/ [gpadmin@mdw contrib]$ ls citext.sql gp_distribution_policy.sql gp_svec_test.sql oid2name.txt postgis-2.1 uninstall_fuzzystrmatch.sql uninstall_hstore.sql dblink.sql gp_session_state.sql hstore.sql orafunc.sql uninstall_citext.sql uninstall_gp_distribution_policy.sql uninstall_orafunc.sql fuzzystrmatch.sql gp_sfv_test.sql indexscan.sql pgcrypto.sql uninstall_dblink.sql uninstall_gp_session_state.sql uninstall_pgcrypto.sql [gpadmin@mdw contrib]$ cd postgis-2.1/ [gpadmin@mdw postgis-2.1]$ ls install postgis_manager.sh uninstall upgrade [gpadmin@mdw postgis-2.1]$ $GPHOME/share/postgresql/contrib/postgis-2.1/postgis_manager.sh gpadmin install SET BEGIN DO CREATE FUNCTION CREATE FUNCTION CREATE FUNCTION CREATE TYPE CREATE FUNCTION : : INSERT 0 1 INSERT 0 1 COMMIT ANALYZE [gpadmin@mdw postgis-2.1]$ : |
자, 이제 PostGIS가 Greenplum docker 컨테이너 안에 설치가 되었습니다.
5. PostGIS 샘플 Query 실행해서 테스트해보기 |
DBeaver DB tool로 아래의 PostGIS 테이블 생성해보고 select query 를 날려보겠습니다.
-- Create PostGIS extension CREATE EXTENSION postgis; -- PostGIS version check SELECT PostGIS_Version(); -- PostGIS sample query CREATE TABLE geom_test ( gid int4, geom geometry, name varchar(25) ); INSERT INTO geom_test ( gid, geom, name ) VALUES ( 1, 'POLYGON((0 0 0,0 5 0,5 5 0,5 0 0,0 0 0))', '3D Square'); INSERT INTO geom_test ( gid, geom, name ) VALUES ( 2, 'LINESTRING(1 1 1,5 5 5,7 7 5)', '3D Line' ); INSERT INTO geom_test ( gid, geom, name ) VALUES ( 3, 'MULTIPOINT(3 4,8 9)', '2D Aggregate Point' );
SELECT * from geom_test WHERE geom && Box3D(ST_GeomFromEWKT('LINESTRING(2 2 0, 3 3 0)')); |
잘 작동하는군요. ^^
많은 도움이 되었기를 바랍니다.
이번 포스팅이 도움이 되었다면 아래의 '공감~'를 꾸욱 눌러주세요.