[PostgreSQL, Greenplum] 문자열 패턴 매칭 (Pattern Matching)

지난번 포스팅에서는 PostgreSQL, Greenplum DB에서 문자열형의 종류, 생성, 처리 연산자(Character Type Operators) 및 함수(Character Functions)에 대해서 소개(https://rfriend.tistory.com/543)하였습니다.

이번 포스팅에서는 PostgreSQL, Greenplum DB에서 문자열 패턴 매칭(String Pattern Matching)하는 세가지 방법을 소개하겠습니다.

(1) 전통적인 SQL LIKE 연산자 (LIKE Operator)

(2) SIMILAR TO 연산자 (the SIMILAR TO Operator)

(3) 정규 표현식 (POSIX-Style Regular Expressions)

(1) SQL LIKE, ~~ 연산자 ('LIKE', or '~~' Operator)

'LIKE' 연산자는 전통적인 SQL에서 문자열 패턴 매칭할 때 가장 보편적으로 사용되며, 아래와 같은 syntax를 사용합니다.

string LIKE pattern [ESCAPE escape-character]

위의 LIKE 연산자는 LIKE 앞의 문자열이 LIKE 뒤의 패턴과 일치할 경우 TRUE를 반환하며, 일치하지 않을 경우는 FALSE를 반환합니다.

와일드카드 문자(wildcard character)로서 (1) 퍼센트 부호(%)는 그 위치부터 어떤 문자열(any string)이 와도 상관없으며, (2) 밑줄 부호(_)는 해당 위치에 하나의 문자가 일치하는지를 봐서 TRUE나 FALSE를 반환합니다. LIKE와 '%', '_' 와일드카드 문자를 사용한 패턴 매칭 예 몇개를 살펴보겠습니다.

pattern matching	returns
SELECT 'abc' LIKE 'abc';	true
SELECT 'abc' LIKE 'a%';	true
SELECT 'abc' LIKE '%b';	false
SELECT 'abc' LIKE '%b%';	true
SELECT 'abc' LIKE '%c';	true
SELECT 'abc' LIKE '_b_';	true
SELECT 'abc' LIKE 'c';	false
SELECT 'abc' LIKE '_c';	false
SELECT 'abc' LIKE '__c';	true

'~~' 연산자를 'LIKE' 연산자 대신 사용할 수 있습니다. 위에 예 중에서 한개만 'LIKE'를 '~~' 연산자로 바꾸어보면 아래와 같습니다. (Jupyter Notebook에서 sql_magic 라이브러리로 SQL문 사용할 때 LIKE 대신 ~~ 연산자를 사용했었습니다.)

SELECT 'abc' ~~ 'a%'; --> returns true

패턴 일치여부를 판단하는데 있어 대소문자를 구분합니다.

SELECT 'abc' LIKE 'a%'; --> returns true

SELECT 'abc' LIKE 'A%'; --> returns false

일치하지 않는 패턴을 찾고 싶으면 'NOT LIKE' 또는 '!~~' 연산자를 사용합니다.

SELECT 'abc' LIKE 'a%'; --> returns true

SELECT 'abc' NOT LIKE 'a%'; --> retruns false

SELECT 'abc' ~~ 'a%'; --> returns true

SELECT 'abc' !~~ 'a%'; --> returns false

(2) SIMILAR TO 연산자 (the SIMILAR TO Operator)

SELECT 'abc' SIMILAR TO 'abc'; -- true

SELECT 'abc' SIMILAR TO 'a'; -- false

SELECT 'abc' SIMILAR TO 'a__'; -- true

SELECT 'abc' SIMILAR TO '_b_'; -- true

SELECT 'abc' SIMILAR TO '_bc'; -- true

SELECT 'abc' SIMILAR TO 'a%'; -- true

SELECT 'abc' SIMILAR TO '%b'; -- false

SELECT 'abc' SIMILAR TO '%b%'; -- true

-- | denotes alternation (either of two alternatives).

-- Parentheses () can be used to group items into a single logical item.

SELECT 'abc' SIMILAR TO '%(b|d)%'; -- true

SELECT 'abc' SIMILAR TO '%(d|e)%'; -- false

-- {m} denotes repetition of the previous item exactly m times.

SELECT 'abbbc' SIMILAR TO '%b{3}%'; -- true

SELECT 'abbbc' SIMILAR TO '%b{4}%'; -- false

-- {m,} denotes repetition of the previous item m or more times.

SELECT 'abbbc' SIMILAR TO '%b{2,}%'; -- true

SELECT 'abbbc' SIMILAR TO '%b{4,}%'; -- false

-- {m,n} denotes repetition of the previous item at least m and not more than n times.

SELECT 'abbbc' SIMILAR TO '%b{2,3}%'; -- true

SELECT substring('foobar' from '%#"o_b#"%' for '#'); -- oob

-- It returns null if there is no match.

SELECT substring('foobar' from '#"o_b#"%' for '#'); -- NULL

SELECT substring('foobar' from 'o...r'); -- oobar

SELECT substring('foobar' from 'o(...)r'); -- oba

(3) 정규 표현식 (POSIX-Style Regular Expressions)

정규표현식을 이용하면 위의 (1) LIKE 연산자, (2) SIMILAR TO 연산자보다 더욱 강력한 패턴 매칭을 할 수 있습니다.

-- Regular Expressions Matching Operators: ~, ~*, !~, !~*

-- ~ operator: Matches regular expression, case sensitive

SELECT 'thomas' ~ '.*thomas.*'; -- true

SELECT 'thomas' ~ '.*Thomas.*'; -- False

-- ~* operator: Matches regular expression, case insensitive

SELECT 'thomas' ~* '.*Thomas.*'; -- true

-- !~ operator: Does not match regular expression, case sensitive

SELECT 'thomas' !~ '.*thomas.*'; -- False

SELECT 'thomas' !~ '.*Thomas.*'; -- True

-- !~* operator: Does not match regular expression, case insensitive

SELECT 'thomas' !~* '.*Thomas.*'; -- false

SELECT 'thomas' !~ '.*Thomas.*'; -- true

SELECT 'thomas' !~* '.*steve.*'; -- true

많은 도움이 되었기를 바랍니다.

이번 포스팅이 도움이 되었다면 아래의 '공감~'를 꾹 눌러주세요. :-)

* Reference: https://www.postgresql.org/docs/9.5/functions-matching.html

728x90

저작자표시 비영리 변경금지 (새창열림)

'Greenplum and PostgreSQL Database' 카테고리의 다른 글

[PostgreSQL, Greenplum] ipython-sql 로 Jupyter Notebook에서 DB 접속하는 4가지 방법 (0)	2020.11.15
[PostgreSQL, Greenplum] ipython-sql, pgspecial로 Jupyter Notebook에서 DB접속하고 SQL query 하기 (0)	2020.10.20
[PostgreSQL, Greenplum] 문자열 처리 연산자 및 함수 (character type operators and functions) (0)	2020.07.04
[PostgreSQL, Greenplum] 산술연산자, 비교연산자, 논리연산자, 기타연산자 (Operators) (0)	2020.06.28
[PostgreSQL, Greenplum DB] 수정 Update, 삭제 Delete, 테이블 비우기 Truncate (0)	2020.06.20

내 블로그 - 관리자 홈 전환	`Q` `Q`
새 글 쓰기	`W` `W`

글 수정 (권한 있는 경우)	`E` `E`
댓글 영역으로 이동	`C` `C`

이 페이지의 URL 복사	`S` `S`
맨 위로 이동	`T` `T`
티스토리 홈 이동	`H` `H`
단축키 안내	`Shift` + `/` `⇧` + `/`

R, Python 분석과 프로그래밍의 친구 (by R Friend)

[PostgreSQL, Greenplum] 문자열 패턴 매칭 (Pattern Matching)

'Greenplum and PostgreSQL Database' 카테고리의 다른 글

카테고리

태그목록

티스토리툴바

단축키

내 블로그

블로그 게시글

모든 영역