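In this post, I will introduce various options and ways of using the Linux grep command to find and print the lines that match a specific pattern, either in a single file or across multiple files in a directory. Combined with regular expressions, grep becomes a very powerful tool for finding patterns inside files.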



[ LINUX grep command : search for a string pattern and print the matching lines ]





First, I prepared a text file named demo_file.txt to use in the examples.



MacBook-Pro:~ rfriend$ ssh gpadmin@192.168.188.131
gpadmin@192.168.188.131's password: xxxxxxxx
Last login: Sat Dec  9 01:06:33 2017 from 192.168.188.1
[gpadmin@mdw ~]$ ls -al
total 544860
drwx------  8 gpadmin gpadmin      4096 2017-12-09 01:12 .
drwxr-xr-x. 4 root    root         4096 2017-11-08 20:04 ..
-rw-------  1 gpadmin gpadmin      4792 2017-12-09 01:10 .bash_history
-rw-r--r--  1 gpadmin gpadmin        18 2014-10-16 22:56 .bash_logout
-rw-r--r--  1 gpadmin gpadmin       398 2017-12-04 13:45 .bash_profile
-rw-r--r--  1 gpadmin gpadmin       124 2014-10-16 22:56 .bashrc
-rw-rw-r--  1 gpadmin gpadmin        28 2017-11-08 20:12 .gphostcache
drwxrwxr-x  2 gpadmin gpadmin      4096 2017-12-04 13:43 .oracle_jre_usage
-rw-------  1 gpadmin gpadmin        32 2017-11-08 20:21 .pgpass
-rw-rw-r--  1 gpadmin gpadmin         0 2017-11-08 20:21 .pgpass.1510140115
-rw-------  1 gpadmin gpadmin      2066 2017-12-09 00:21 .psql_history
drwx------  2 gpadmin gpadmin      4096 2017-11-08 20:05 .ssh
drwxrwxr-x  2 gpadmin gpadmin      4096 2017-12-06 20:10 command
-rw-r--r--  1 gpadmin gpadmin       233 2017-12-09 01:12 demo_file.txt
drwxrwxr-x  2 gpadmin gpadmin      4096 2017-12-08 09:45 gpAdminLogs
drwxr-xr-x  2 gpadmin gpadmin      4096 2017-11-08 20:14 gpconfigs
drwxrwxr-x  2 gpadmin gpadmin      4096 2017-12-07 20:29 gptext
-rwxr-xr-x  1 gpadmin gpadmin      2916 2017-12-04 13:42 gptext_install_config
-rwxr-xr-x  1 gpadmin gpadmin 192264305 2017-09-19 09:20 greenplum-text-2.1.3-rhel6_x86_64.bin
-rw-r--r--  1 gpadmin gpadmin 191435368 2017-12-04 12:36 greenplum-text-2.1.3-rhel6_x86_64.tar.gz
-rw-r--r--  1 gpadmin gpadmin 174157387 2017-12-04 13:28 jdk-8u161-linux-x64.rpm

[gpadmin@mdw ~]$ cat demo_file.txt
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
This Line Has All Its First Character Of The Word With Upper Case.

Two lines above this line is empty.
And this is the last line.
[gpadmin@mdw ~]$ 
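
If you want to follow along, the demo file can be recreated with a sketch like the one below. It assumes the file holds the five text lines shown above plus a single empty fourth line, which is consistent with the 233-byte size reported by ls. The scratch directory created by mktemp is only an illustrative location.

```shell
# Recreate demo_file.txt in a scratch directory (hypothetical location).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > demo_file.txt <<'EOF'
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
This Line Has All Its First Character Of The Word With Upper Case.

Two lines above this line is empty.
And this is the last line.
EOF

# Size in bytes and number of lines, to compare with the ls listing.
size=$(wc -c < demo_file.txt | tr -d ' ')
lines=$(wc -l < demo_file.txt | tr -d ' ')
echo "bytes=$size lines=$lines"
```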

 



The basic syntax of grep is as follows.


  grep [OPTIONS] PATTERN [FILE...] 



Let's go through the options one by one with examples.




1. Search a single file for one pattern and print the matching lines: grep PATTERN [FILE]



[gpadmin@mdw ~]$ grep "line" demo_file.txt
this line is the 1st lower case line in this file.
Two lines above this line is empty.
And this is the last line.
[gpadmin@mdw ~]$ 

 




2. Highlight the part that matches the search pattern in a different color: --color



[gpadmin@mdw ~]$ grep  --color  "line"  demo_file.txt
this line is the 1st lower case line in this file.
Two lines above this line is empty.
And this is the last line.
[gpadmin@mdw ~]$ 

 



Setting GREP_COLOR in front of grep --color, as in GREP_COLOR="1;32" grep --color "line" demo_file.txt, selects the color used for the part that matches the pattern.

  • GREP_COLOR="1;32"  : green
  • GREP_COLOR="1;34"  : blue
  • GREP_COLOR="1;36"  : cyan (light blue)
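
As a small sketch of how the color setting can be tried out, the snippet below forces color with --color=always (the --help output lists always, never, and auto as the accepted values) and requests bold green through GREP_COLORS='mt=1;32'. Recent GNU grep releases treat GREP_COLOR as obsolescent and recommend GREP_COLORS instead, so that form is assumed here. The sample file and the temporary directory are made up for illustration.

```shell
# Hypothetical one-line sample file in a scratch directory.
workdir=$(mktemp -d)
printf 'this line is the 1st lower case line in this file.\n' > "$workdir/demo_file.txt"

# --color=always emits escape sequences even when stdout is not a terminal.
colored=$(GREP_COLORS='mt=1;32' grep --color=always "line" "$workdir/demo_file.txt")

# --color=never returns the plain text for comparison.
plain=$(grep --color=never "line" "$workdir/demo_file.txt")

printf '%s\n%s\n' "$colored" "$plain"
```

When printed to a terminal, the first line should show each "line" highlighted, while the second stays uncolored.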




3. Search case-insensitively and print the matching lines: -i


The uppercase "LINE", which was not matched in example 2, is matched this time.



[gpadmin@mdw ~]$ grep  --color  -i  "line"  demo_file.txt
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
This Line Has All Its First Character Of The Word With Upper Case.
Two lines above this line is empty.
And this is the last line.
[gpadmin@mdw ~]$ 

 




4. Search multiple files by using the wildcard * (asterisk) in the file name: filename*.txt


First, I used cp to copy demo_file.txt to demo_file2.txt, so that there are two files.



[gpadmin@mdw ~]$ cp  demo_file.txt  demo_file2.txt




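Next, with the wildcard '*' (asterisk) in the middle of the file name, as in demo_*.txt, the shell matches every file whose name fits regardless of what stands in the * position, and grep searches all of those files. Unlike example 3, each output line is now prefixed with the file name (demo_file.txt, demo_file2.txt).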



[gpadmin@mdw ~]$ grep --color -i "line" demo_*.txt
demo_file.txt:THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
demo_file.txt:this line is the 1st lower case line in this file.
demo_file.txt:This Line Has All Its First Character Of The Word With Upper Case.
demo_file.txt:Two lines above this line is empty.
demo_file.txt:And this is the last line.
demo_file2.txt:THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
demo_file2.txt:this line is the 1st lower case line in this file.
demo_file2.txt:This Line Has All Its First Character Of The Word With Upper Case.
demo_file2.txt:Two lines above this line is empty.
demo_file2.txt:And this is the last line.





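5. Search every file in the current directory: * (asterisk only, without a file name)

Note that * only expands to the entries of the current directory, so grep does not descend into subdirectories here. Searching subdirectories recursively requires -r (or -R), which is listed in the --help output in section 15.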



[gpadmin@mdw ~]$ grep  --color -i  "line"  *
demo_file.txt:THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
demo_file.txt:this line is the 1st lower case line in this file.
demo_file.txt:This Line Has All Its First Character Of The Word With Upper Case.
demo_file.txt:Two lines above this line is empty.
demo_file.txt:And this is the last line.
demo_file2.txt:THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
demo_file2.txt:this line is the 1st lower case line in this file.
demo_file2.txt:This Line Has All Its First Character Of The Word With Upper Case.
demo_file2.txt:Two lines above this line is empty.
demo_file2.txt:And this is the last line.
greenplum-text-2.1.3-rhel6_x86_64.bin:support, invoicing, and online services. Licensee is responsible for obtaining
greenplum-text-2.1.3-rhel6_x86_64.bin:    if tablog.getNumLines() > 1:
greenplum-text-2.1.3-rhel6_x86_64.bin:        for line in open(to_append_file).readlines():
greenplum-text-2.1.3-rhel6_x86_64.bin:            line = line.strip()
greenplum-text-2.1.3-rhel6_x86_64.bin:            if line.startswith('#') or line == '':
greenplum-text-2.1.3-rhel6_x86_64.bin:            if len(line.split()) != 1:
greenplum-text-2.1.3-rhel6_x86_64.bin:        for line in open(to_append_file).readlines():
greenplum-text-2.1.3-rhel6_x86_64.bin:            line = line.strip()
greenplum-text-2.1.3-rhel6_x86_64.bin:            if line.startswith('#') or line == '':
greenplum-text-2.1.3-rhel6_x86_64.bin:            if line.find('=>') == -1:
greenplum-text-2.1.3-rhel6_x86_64.bin:            parts = line.split('=>')
greenplum-text-2.1.3-rhel6_x86_64.bin:        for line in open(to_append_file).readlines():
greenplum-text-2.1.3-rhel6_x86_64.bin:            line = line.strip()
greenplum-text-2.1.3-rhel6_x86_64.bin:            if line.startswith('#') or line == '':
greenplum-text-2.1.3-rhel6_x86_64.bin:            if line.find('=>') != -1:
greenplum-text-2.1.3-rhel6_x86_64.bin:                parts = line.split('=>')
greenplum-text-2.1.3-rhel6_x86_64.bin:            sys.stdin.readline()
-- (output omitted) --
Binary file greenplum-text-2.1.3-rhel6_x86_64.tar.gz matches
Binary file jdk-8u161-linux-x64.rpm matches
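
To make the difference between * and a recursive search concrete, here is a minimal sketch. The directory layout (top.txt and sub/nested.txt) is hypothetical and created only for this illustration; -r and -s are the options listed in the --help output in section 15.

```shell
# Hypothetical layout: one file at the top level, one inside a subdirectory.
workdir=$(mktemp -d)
mkdir -p "$workdir/sub"
printf 'top level line\n' > "$workdir/top.txt"
printf 'nested line\n' > "$workdir/sub/nested.txt"
cd "$workdir" || exit 1

# "*" expands to "sub" and "top.txt" only. -s hides the "Is a directory"
# message that some grep versions print, and "|| true" ignores the error status.
flat=$(grep -s "line" * || true)

# -r walks into sub/ as well; sort makes the output order predictable.
recursive=$(grep -r "line" . | sort)

printf 'flat:\n%s\nrecursive:\n%s\n' "$flat" "$recursive"
```

Only the recursive form reports sub/nested.txt. Adding -I (equivalent to --binary-files=without-match in the help text) would additionally skip binary files such as the .tar.gz and .rpm matches above.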





6. Invert the search condition and print the lines that do NOT match: -v



[gpadmin@mdw ~]$ cat demo_file.txt
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
This Line Has All Its First Character Of The Word With Upper Case.

Two lines above this line is empty.
And this is the last line.
[gpadmin@mdw ~]$ 
[gpadmin@mdw ~]$ grep  -i  -v  "case"  demo_file.txt

Two lines above this line is empty.
And this is the last line.
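
As the output above shows, the empty line also counts as a non-matching line. The sketch below repeats the idea on a tiny hypothetical file (sample.txt) and combines -v with -c to count the non-matching lines instead of printing them.

```shell
# Hypothetical four-line sample: two lines with "case", one empty, one without.
workdir=$(mktemp -d)
printf 'Upper CASE here\nlower case here\n\nno match here\n' > "$workdir/sample.txt"

# Lines that do not contain "case" in any capitalization (the empty line included).
kept=$(grep -i -v "case" "$workdir/sample.txt")

# -c together with -v counts the non-matching lines.
kept_count=$(grep -i -v -c "case" "$workdir/sample.txt")

printf '%s\ncount=%s\n' "$kept" "$kept_count"
```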





7. Show the line number of each matching line: -n



[gpadmin@mdw ~]$ grep  --color  -n  "case"  demo_file.txt
1:THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
2:this line is the 1st lower case line in this file.
3:This Line Has All Its First Character Of The Word With Upper Case.





8. Print only the lines where the entire line matches exactly: -x



[gpadmin@mdw ~]$ grep  -n  -x  "this line is the 1st lower case line in this file."  demo_file.txt
2:this line is the 1st lower case line in this file.
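
For comparison, the following sketch contrasts a plain substring search with -x and with -w (listed in the --help output as matching only whole words). The three-line sample file is hypothetical.

```shell
# Hypothetical sample with "line" as a full line part, a substring, and a plural.
workdir=$(mktemp -d)
printf 'last line\nthis is the last line\ntwo lines\n' > "$workdir/sample.txt"

# Plain search: "line" is found anywhere, including inside "lines".
substring=$(grep -c "line" "$workdir/sample.txt")

# -w: "line" must stand alone as a word, so "lines" no longer matches.
whole_word=$(grep -c -w "line" "$workdir/sample.txt")

# -x: the pattern must cover the entire line.
whole_line=$(grep -c -x "last line" "$workdir/sample.txt")

echo "substring=$substring whole_word=$whole_word whole_line=$whole_line"
```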





9. Print the number of matching lines for each input file: -c, --count



[gpadmin@mdw ~]$ grep  -c  "line"  demo_file.txt
3
[gpadmin@mdw ~]$ grep  --count  "line"  demo_file.txt
3
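
Two details are worth a quick sketch: with several input files -c prints one count per file, and it counts matching lines rather than individual occurrences. The file names a.txt and b.txt are hypothetical, and counting occurrences through -o and wc -l is just one possible approach.

```shell
# Hypothetical files: a.txt has two matching lines, b.txt has none.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'line one\nline two\n' > a.txt
printf 'nothing here\n' > b.txt

# One "file:count" entry per input file.
counts=$(grep -c "line" a.txt b.txt)

# A line containing "line" twice still counts as one matching line ...
matching_lines=$(printf 'this line is a line\n' | grep -c "line")

# ... while -o prints every occurrence on its own line, so wc -l counts occurrences.
occurrences=$(printf 'this line is a line\n' | grep -o "line" | wc -l | tr -d ' ')

printf '%s\nlines=%s occurrences=%s\n' "$counts" "$matching_lines" "$occurrences"
```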





10. Print NUM lines before (Before), after (After), or on both sides (Context) of each matching line: -B NUM, -A NUM, -C NUM



--   10-1. Also print NUM lines before each matching line: -B NUM



[gpadmin@mdw ~]$ cat  demo_file.txt
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
This Line Has All Its First Character Of The Word With Upper Case.

Two lines above this line is empty.
And this is the last line.
[gpadmin@mdw ~]$ 
[gpadmin@mdw ~]$ grep  --color -n  -B 2 "First"  demo_file.txt
1-THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
2-this line is the 1st lower case line in this file.
3:This Line Has All Its First Character Of The Word With Upper Case.
[gpadmin@mdw ~]$ 

 



--   10-2. Also print NUM lines after each matching line: -A NUM



[gpadmin@mdw ~]$ cat demo_file.txt
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
This Line Has All Its First Character Of The Word With Upper Case.

Two lines above this line is empty.
And this is the last line.
[gpadmin@mdw ~]$ 
[gpadmin@mdw ~]$ grep  --color  -n  -A 2  "First"  demo_file.txt
3:This Line Has All Its First Character Of The Word With Upper Case.
4-
5-Two lines above this line is empty.
[gpadmin@mdw ~]$ 

 



--   10-3. Print NUM lines of context both before and after each matching line: -C NUM



[gpadmin@mdw ~]$ cat demo_file.txt
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
This Line Has All Its First Character Of The Word With Upper Case.

Two lines above this line is empty.
And this is the last line.
[gpadmin@mdw ~]$ 
[gpadmin@mdw ~]$ grep --color -n  -C 2 "First" demo_file.txt
1-THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
2-this line is the 1st lower case line in this file.
3:This Line Has All Its First Character Of The Word With Upper Case.
4-
5-Two lines above this line is empty.
[gpadmin@mdw ~]$ 
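
In the context examples above, matching lines are marked with ":" after the line number and context lines with "-". When matches are far apart, grep also separates the groups with a "--" line. A minimal sketch on a hypothetical six-line file:

```shell
# Hypothetical sample: matches on lines 1 and 5, other lines in between.
workdir=$(mktemp -d)
printf 'match a\nb\nc\nd\nmatch e\nf\n' > "$workdir/sample.txt"

# One line of trailing context per match; the two groups are not adjacent,
# so grep prints "--" between them.
context=$(grep -n -A 1 "match" "$workdir/sample.txt")

printf '%s\n' "$context"
```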

 




11. Stop reading the file after NUM matching lines: -m NUM



[gpadmin@mdw ~]$ grep  --color  -n  -i  -m 2  "line"  demo_file.txt
1:THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
2:this line is the 1st lower case line in this file.
[gpadmin@mdw ~]$ 
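
The limit set by -m applies to each input file separately, not to the output as a whole. The sketch below illustrates this with two hypothetical files, a.txt and b.txt.

```shell
# Hypothetical files with two matching lines each.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'line 1\nline 2\n' > a.txt
printf 'line 3\nline 4\n' > b.txt

# -m 1 stops after the first match in a.txt, then starts over for b.txt.
first_hits=$(grep -m 1 "line" a.txt b.txt)

printf '%s\n' "$first_hits"
```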

 




12. Print only the matched part of each matching line: -o, --only-matching



[gpadmin@mdw ~]$ grep  -n  -o  "First"  demo_file.txt
3:First
[gpadmin@mdw ~]$ grep  -n  --only-matching  "First"  demo_file.txt
3:First

 




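13. Print the byte offset of the match: -b

-b reports a 0-based byte offset counted from the beginning of the file, not a position within the line. Combined with -o, the offset is that of the matched text itself. In the example here, lines 1 and 2 take 51 bytes each (50 characters plus a newline) and "First" starts 22 bytes into line 3, which gives 51 + 51 + 22 = 124.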



[gpadmin@mdw ~]$ grep  -o  -b "First" demo_file.txt
124:First
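
The offset 124 can be checked with a short sketch that rebuilds the first three lines of the demo file in a hypothetical scratch directory. Without -o, -b reports the offset at which the matching line begins instead of the offset of the match itself.

```shell
# First three lines of demo_file.txt, recreated in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/demo_file.txt" <<'EOF'
THIS LINE IS THE 1ST UPPER CASE LINE IN THIS FILE.
this line is the 1st lower case line in this file.
This Line Has All Its First Character Of The Word With Upper Case.
EOF

# With -o: offset of the matched text, counted from the start of the file.
match_offset=$(grep -o -b "First" "$workdir/demo_file.txt")

# Without -o: offset of the beginning of the matching line.
line_offset=$(grep -b "First" "$workdir/demo_file.txt")

printf '%s\n%s\n' "$match_offset" "$line_offset"
```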

 




14. Print only the names of the files that contain a match: -l



[gpadmin@mdw ~]$ grep  -l  "First"  demo_*
demo_file.txt
demo_file2.txt
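
The counterpart of -l is -L, which the --help output describes as printing only the names of files containing no match. A minimal sketch with two hypothetical files:

```shell
# Hypothetical files: only with_match.txt contains "First".
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'First line\n' > with_match.txt
printf 'second line\n' > without_match.txt

# -l lists files that contain at least one match.
matching_files=$(grep -l "First" *.txt)

# -L lists files that contain no match at all; "|| true" ignores the exit
# status, which has differed between grep versions for this option.
non_matching_files=$(grep -L "First" *.txt || true)

echo "with: $matching_files / without: $non_matching_files"
```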

 




15. Print a summary of grep options and exit: --help


This is perhaps the most important (?) option in today's post: even if you forget everything else, remembering this one will keep you covered.

Help me, help~!  ^^



[gpadmin@mdw ~]$ grep --help
Usage: grep [OPTION]... PATTERN [FILE]...
Search for PATTERN in each FILE or standard input.
PATTERN is, by default, a basic regular expression (BRE).
Example: grep -i 'hello world' menu.h main.c

Regexp selection and interpretation:
  -E, --extended-regexp     PATTERN is an extended regular expression (ERE)
  -F, --fixed-strings       PATTERN is a set of newline-separated fixed strings
  -G, --basic-regexp        PATTERN is a basic regular expression (BRE)
  -P, --perl-regexp         PATTERN is a Perl regular expression
  -e, --regexp=PATTERN      use PATTERN for matching
  -f, --file=FILE           obtain PATTERN from FILE
  -i, --ignore-case         ignore case distinctions
  -w, --word-regexp         force PATTERN to match only whole words
  -x, --line-regexp         force PATTERN to match only whole lines
  -z, --null-data           a data line ends in 0 byte, not newline

Miscellaneous:
  -s, --no-messages         suppress error messages
  -v, --invert-match        select non-matching lines
  -V, --version             print version information and exit
      --help                display this help and exit
      --mmap                ignored for backwards compatibility

Output control:
  -m, --max-count=NUM       stop after NUM matches
  -b, --byte-offset         print the byte offset with output lines
  -n, --line-number         print line number with output lines
      --line-buffered       flush output on every line
  -H, --with-filename       print the filename for each match
  -h, --no-filename         suppress the prefixing filename on output
      --label=LABEL         print LABEL as filename for standard input
  -o, --only-matching       show only the part of a line matching PATTERN
  -q, --quiet, --silent     suppress all normal output
      --binary-files=TYPE   assume that binary files are TYPE;
                            TYPE is `binary', `text', or `without-match'
  -a, --text                equivalent to --binary-files=text
  -I                        equivalent to --binary-files=without-match
  -d, --directories=ACTION  how to handle directories;
                            ACTION is `read', `recurse', or `skip'
  -D, --devices=ACTION      how to handle devices, FIFOs and sockets;
                            ACTION is `read' or `skip'
  -R, -r, --recursive       equivalent to --directories=recurse
      --include=FILE_PATTERN  search only files that match FILE_PATTERN
      --exclude=FILE_PATTERN  skip files and directories matching FILE_PATTERN
      --exclude-from=FILE   skip files matching any file pattern from FILE
      --exclude-dir=PATTERN  directories that match PATTERN will be skipped.
  -L, --files-without-match  print only names of FILEs containing no match
  -l, --files-with-matches  print only names of FILEs containing matches
  -c, --count               print only a count of matching lines per FILE
  -T, --initial-tab         make tabs line up (if needed)
  -Z, --null                print 0 byte after FILE name

Context control:
  -B, --before-context=NUM  print NUM lines of leading context
  -A, --after-context=NUM   print NUM lines of trailing context
  -C, --context=NUM         print NUM lines of output context
  -NUM                      same as --context=NUM
      --color[=WHEN],
      --colour[=WHEN]       use markers to highlight the matching strings;
                            WHEN is `always', `never', or `auto'
  -U, --binary              do not strip CR characters at EOL (MSDOS)
  -u, --unix-byte-offsets   report offsets as if CRs were not there (MSDOS)

`egrep' means `grep -E'.  `fgrep' means `grep -F'.
Direct invocation as either `egrep' or `fgrep' is deprecated.
With no FILE, or when FILE is -, read standard input.  If less than two FILEs
are given, assume -h.  Exit status is 0 if any line was selected, 1 otherwise;
if any error occurs and -q was not given, the exit status is 2.

Report bugs to: bug-grep@gnu.org
GNU Grep home page: <http://www.gnu.org/software/grep/>
General help using GNU software: <http://www.gnu.org/gethelp/>
[gpadmin@mdw ~]$ 
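
The exit status described at the end of the help text makes grep handy in shell scripts. The sketch below, using a hypothetical one-line file and a deliberately missing file name, shows the three cases: 0 when a line is selected, 1 when nothing matches, and 2 when an error occurs.

```shell
# Hypothetical one-line sample file.
workdir=$(mktemp -d)
printf 'And this is the last line.\n' > "$workdir/demo_file.txt"

# -q prints nothing; only the exit status is used (0 = a line was selected).
if grep -q "last" "$workdir/demo_file.txt"; then found="yes"; else found="no"; fi

# No matching line: exit status 1.
status_no_match=0
grep -q "missing" "$workdir/demo_file.txt" || status_no_match=$?

# Nonexistent file: exit status 2 (-s only hides the error message).
status_error=0
grep -s "last" "$workdir/no_such_file.txt" > /dev/null || status_error=$?

echo "found=$found no_match=$status_no_match error=$status_error"
```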

 



In the next post, I will take a look at regular expressions.


I hope this was helpful.


* Reference: https://www.computerhope.com/unix/ugrep.htm








Posted by R Friend R_Friend