Everyone Can Learn the Linux Awk Command

Introduction

Awk is a powerful text analysis tool. Where grep searches and sed edits, awk is strongest at analyzing data and generating reports. Simply put, awk reads a file line by line, splits each line into fields using whitespace as the default separator, and then processes those fields.

There are three versions of awk: awk, nawk, and gawk. Unless otherwise stated, awk here refers to gawk, the GNU version of AWK.

The name awk comes from the initials of the surnames of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. AWK is in fact a language of its own, the AWK programming language, which its three creators formally defined as a "pattern scanning and processing language." It lets you write short programs that read input files, sort data, process data, perform calculations on the input, and generate reports, among many other things.

Instructions

awk 'pattern {action}' {filenames}

Although the operations can be complicated, the syntax is always the same: pattern is what awk looks for in the data, and action is a series of commands executed when a match is found. Braces ({}) do not have to appear in every program, but they group the series of instructions that belongs to a particular pattern. A pattern is typically a regular expression enclosed in slashes.

The most basic function of the awk language is to browse files or strings and extract information according to specified rules. Once awk has extracted the information, other text operations can be performed on it. A complete awk script is usually used to format the information in a text file.

Awk normally processes a file one line at a time: it receives a line, then executes the corresponding commands on that text.
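As a minimal sketch of the pattern and action structure described above, the following runs awk over three made-up input lines. The sample text and the pattern /error/ are assumptions chosen for illustration only.

```shell
# Hypothetical sample input; any text would do.
out=$(printf 'info start\nerror disk full\ninfo done\n' |
  awk '/error/ { print "matched:", $0 }')

# Only the line that matches the pattern triggers the action.
echo "$out"   # prints: matched: error disk full
```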

Calling awk

There are three ways to call awk:

1. Command line mode

awk [-F field-separator] 'commands' input-file(s)

Here, commands is the actual awk program, the [-F field-separator] option is optional, and input-file(s) is the file or files to be processed.

In awk, each item on a line that is delimited by the field separator is called a field. When no separator is named with -F, the default field separator is whitespace (spaces and tabs).
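The effect of -F can be seen with a small sketch. The record below is a hypothetical line in the style of /etc/passwd, not taken from a real system.

```shell
# Hypothetical colon-separated record.
line='alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# With -F ':' the first field is the account name.
first=$(printf '%s\n' "$line" | awk -F ':' '{ print $1 }')

# Without -F, awk splits on whitespace, so the whole line is a single field.
count=$(printf '%s\n' "$line" | awk '{ print NF }')

echo "$first $count"   # prints: alice 1
```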

2. Shell script mode

Put all the awk commands in a file and make that file executable. The awk interpreter is named on the first line of the script, and the program is then run by typing the script name.

This is the counterpart of the first line of a shell script: #!/bin/sh

which is replaced with: #!/bin/awk -f (the path to awk may differ between systems)

3. Script file mode: put all the awk commands in a separate file and then call:

awk -f awk-script-file input-file(s)

Here, the -f option loads the awk script from awk-script-file, and input-file(s) is the same as above.
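A rough sketch of the script-file mode follows. The temporary file, the one-line program, and the sample input are all hypothetical, and mktemp is assumed to be available on the system.

```shell
# Write a small awk program to a temporary file.
script=$(mktemp)
printf '%s\n' 'NF >= 2 { print $1 }' > "$script"

# Run the program from the file with -f; it prints the first field
# of every line that has at least two fields.
out=$(printf 'a b\nsingle\nc d e\n' | awk -f "$script")
echo "$out"

rm -f "$script"
```

Keeping the program in a file avoids shell quoting problems once the awk code grows beyond a line or two.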

Getting started example

Suppose the output of last -n 5 is as follows:

# last -n 5    (take only the first five lines)

root pts/1 192.168.1.100 Tue Feb 10 11:21 still logged in

root pts/1 192.168.1.100 Tue Feb 10 00:46 - 02:28 (01:41)

root pts/1 192.168.1.100 Mon Feb 9 11:41 - 18:30 (06:48)

dmtsai pts/1 192.168.1.100 Mon Feb 9 11:41 - 11:41 (00:00)

root tty1 Fri Sep 5 14:09 - 14:10 (00:01)

To display only the accounts of the five most recent logins:

# last -n 5 | awk '{print $1}'

root

root

root

dmtsai

root

The awk workflow is as follows: read one record, delimited by the newline character '\n', then split the record into fields by the specified field separator. $0 represents the whole record, $1 the first field, and $n the nth field. The default field separator is a space or a tab, so $1 is the logged-in user, $3 is the user's IP address, and so on.
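The field splitting described above can be checked on a single line. This sketch feeds awk one hypothetical line in the format printed by last and prints the first field, the third field, and the field count NF.

```shell
# One hypothetical line in the format printed by `last`.
line='root pts/1 192.168.1.100 Tue Feb 10 11:21 still logged in'

out=$(printf '%s\n' "$line" | awk '{ print $1; print $3; print NF }')
echo "$out"
# prints:
# root
# 192.168.1.100
# 10
```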

To display only the account names in /etc/passwd:

# cat /etc/passwd | awk -F ':' '{print $1}'

root

daemon

bin

sys

This is an example of awk with an action only: the action {print $1} is executed on every line.

-F sets the field separator to ':'.

To display only the account names in /etc/passwd and the shell for each account, separated by a tab:

# cat /etc/passwd | awk -F ':' '{print $1"\t"$7}'

root    /bin/bash

daemon  /bin/sh

bin     /bin/sh

sys     /bin/sh

To display the /etc/passwd accounts and their shells separated by commas, with a header line "name,shell" before all rows and "blue,/bin/nosh" added as the last line:

cat /etc/passwd | awk -F ':' 'BEGIN {print "name,shell"} {print $1","$7} END {print "blue,/bin/nosh"}'

name,shell

root,/bin/bash

daemon,/bin/sh

bin,/bin/sh

sys,/bin/sh

blue,/bin/nosh

The awk workflow is as follows: first execute BEGIN, then read the file. Read one record, delimited by the newline character '\n', split it into fields by the specified field separator ($0 represents the whole record, $1 the first field, $n the nth field), and execute the action that corresponds to the matching pattern. Then read the second record, and so on until all records have been read. Finally, execute the END block.
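The order of execution can be made visible by printing the record counter NR in each block. This is a small sketch with made-up input: BEGIN runs before any input is read, the main action runs once per record, and END runs after the last record.

```shell
out=$(printf 'a\nb\nc\n' |
  awk 'BEGIN { print "begin NR=" NR }
       { print "record " NR ": " $0 }
       END { print "end NR=" NR }')
echo "$out"
# prints:
# begin NR=0
# record 1: a
# record 2: b
# record 3: c
# end NR=3
```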

Search /etc/passwd for all lines that contain the keyword root:

# awk -F: '/root/' /etc/passwd

root:x:0:0:root:/root:/bin/bash

This is an example of using a pattern. A line that matches the pattern (here, root) has the action executed on it. No action is specified, so the default is to print the whole line.

The search supports regular expressions. For example, to find lines that begin with root: awk -F: '/^root/' /etc/passwd

Search /etc/passwd for all lines that contain the keyword root and display the corresponding shell:

# awk -F: '/root/{print $7}' /etc/passwd

/bin/bash

Here the action {print $7} is specified.
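The difference between the unanchored pattern /root/ and the anchored pattern /^root/ shows up when another account happens to use /root as its home directory. The three records below are hypothetical sample data, not the contents of a real /etc/passwd.

```shell
# Hypothetical passwd-style records; "operator" has /root as its home directory.
data='root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
daemon:x:1:1:daemon:/usr/sbin:/bin/sh'

# /root/ matches anywhere in the line, so two accounts are printed.
any=$(printf '%s\n' "$data" | awk -F: '/root/ { print $1 }')

# /^root/ matches only at the start of the line.
anchored=$(printf '%s\n' "$data" | awk -F: '/^root/ { print $1 }')

echo "$any"
echo "$anchored"
```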

Awk built-in variables

Awk has a number of built-in variables that hold environment information. These variables can be changed. Some of the most commonly used ones are listed below.

ARGC number of command line arguments

ARGV array of command line arguments

ENVIRON array that gives access to the system environment variables

FILENAME name of the file awk is currently reading

FNR number of records read from the current file

FS input field separator, equivalent to the command line -F option

NF number of fields in the current record

NR number of records read so far

OFS output field separator

ORS output record separator

RS input record separator

In addition, the $0 variable refers to the entire record, $1 to the first field of the current line, $2 to the second field of the current line, and so on.
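A short sketch of a few of these variables working together, using made-up colon-separated input: FS and OFS are set in a BEGIN block, and NR and NF are printed for each record.

```shell
out=$(printf 'a:b:c\nd:e\n' |
  awk 'BEGIN { FS = ":"; OFS = "-" } { print NR, NF, $1 }')
echo "$out"
# prints:
# 1-3-a
# 2-2-d
```

Setting FS inside BEGIN has the same effect as passing -F ':' on the command line.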

Print statistics for /etc/passwd: the file name, the line number of each line, the number of columns in each line, and the full content of the line:

# awk -F ':' '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF ",linecontent:"$0}' /etc/passwd

filename:/etc/passwd,linenumber:1,columns:7,linecontent:root:x:0:0:root:/root:/bin/bash

filename:/etc/passwd,linenumber:2,columns:7,linecontent:daemon:x:1:1:daemon:/usr/sbin:/bin/sh

filename:/etc/passwd,linenumber:3,columns:7,linecontent:bin:x:2:2:bin:/bin:/bin/sh

filename:/etc/passwd,linenumber:4,columns:7,linecontent:sys:x:3:3:sys:/dev:/bin/sh

Using printf instead of print makes the code more concise and easier to read:

awk -F ':' '{printf("filename:%10s,linenumber:%s,columns:%s,linecontent:%s\n",FILENAME,NR,NF,$0)}' /etc/passwd

print and printf

Awk provides two output functions, print and printf.

The arguments to print can be variables, numbers, or strings. Strings must be enclosed in double quotes, and arguments are separated by commas. Without commas, the arguments are concatenated with nothing between them. The comma plays the role of the output field separator (OFS), which defaults to a space.

The printf function works much like printf in the C language and formats strings. When the output is complicated, printf is easier to use and the code is easier to understand.
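The three behaviors can be compared side by side. This sketch uses one made-up record; the format string %-6s|%s is just an illustrative choice.

```shell
out=$(printf 'root:/bin/bash\n' | awk -F: '{
  print $1, $2                  # comma: joined by OFS, a space by default
  print $1 $2                   # no comma: concatenated directly
  printf("%-6s|%s\n", $1, $2)   # printf: explicit format and explicit newline
}')
echo "$out"
# prints:
# root /bin/bash
# root/bin/bash
# root  |/bin/bash
```

Unlike print, printf does not append a newline on its own, so the format string has to end with \n.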

Awk programming

Variables and assignments

In addition to the built-in variables, awk supports user-defined variables.

The following counts the number of accounts in /etc/passwd:

awk '{count++;print $0;} END{print "user count is ", count}' /etc/passwd

root:x:0:0:root:/root:/bin/bash

user count is  40

count is a user-defined variable. The earlier action blocks contained only a single print; print is just one statement, and an action block can hold multiple statements separated by semicolons (;).

count is not initialized here. Although it defaults to 0, the proper practice is to initialize it explicitly:

awk 'BEGIN {count=0;print "[start]user count is ", count} {count=count+1;print $0;} END{print "[end]user count is ", count}' /etc/passwd

[start]user count is  0

root:x:0:0:root:/root:/bin/bash

[end]user count is  40
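The default value of an uninitialized variable can be checked directly. In this small sketch the variable name x is arbitrary; it behaves as an empty string in a string context and as 0 in a numeric context.

```shell
out=$(awk 'BEGIN { print "[" x "]"; print x + 0; x++; print x }')
echo "$out"
# prints:
# []
# 0
# 1
```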

Count the number of bytes occupied by the files in a folder:

ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END{print "[end]size is ", size}'

[end]size is  8657198

To display the size in megabytes:

ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END{print "[end]size is ", size/1024/1024,"M"}'

[end]size is  8.25889 M

Note that the total does not include the contents of subdirectories.
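The same summation can be tried on fixed input so the result is predictable. The listing below is hypothetical ls -l style output. Skipping the leading "total" line with NR > 1 and using printf to control the number of decimals are choices made for this sketch, not part of the commands above.

```shell
# Hypothetical `ls -l` style lines; the fifth field is the size in bytes.
listing='total 8
-rw-r--r-- 1 user user 1024 Feb 10 11:21 a.txt
-rw-r--r-- 1 user user 2048 Feb 10 11:22 b.txt'

out=$(printf '%s\n' "$listing" |
  awk 'NR > 1 { size += $5 }
       END { printf("%d bytes, %.2f KB\n", size, size / 1024) }')
echo "$out"   # prints: 3072 bytes, 3.00 KB
```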

Conditional statements

The conditional statements in awk are borrowed from the C language. The forms are as follows:

if (expression) {

    statement;

    statement;

}

if (expression) {

    statement1;

} else {

    statement2;

}

if (expression) {

    statement1;

} else if (expr