You should now be aware of some of the powers of awk and how rich the language is in itself. The data we have been able to produce from the 30,000 line file is truly powerful and easily extracted. We just need to replace the field we have used before with $1. This field represents the client IP address. With the following code, we can print each IP address along with the number of times it has been used to access the web server:
{ ip[$1]++ }
END {
    for (i in ip)
        print i, " has accessed the server ", ip[i], " times."
}
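To see the counting step in isolation, the following sketch runs the same logic against three made-up log lines. The addresses are placeholders from the 192.0.2.0/24 documentation range, not entries from the real log, and the print statement is shortened to just the address and its count. As awk does not guarantee the order in which for (i in ip) visits the array, the output is piped through sort here to keep it predictable.

```shell
# Hypothetical sample input: only the first field (the client IP) matters here.
counts=$(printf '%s\n' \
  '192.0.2.10 - - [10/Oct/2015:13:55:36] "GET / HTTP/1.1" 200 512' \
  '192.0.2.20 - - [10/Oct/2015:13:55:40] "GET /a HTTP/1.1" 200 128' \
  '192.0.2.10 - - [10/Oct/2015:13:56:02] "GET /b HTTP/1.1" 404 64' |
  awk '{ ip[$1]++ } END { for (i in ip) print i, ip[i] }' | sort)

# One line per distinct address, followed by its request count.
echo "$counts"
```

Each distinct value of $1 becomes a key in the ip array, so the number of output lines equals the number of distinct client addresses in the input.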
We want to extend this to show only the highest-ranking IP address, that is, the address that has been used the most to access the site. The work, again, will mainly be in the END block and will make use of a comparison against the current highest-ranking address. The following file can be created and saved as ip.awk:
{ ip[$1]++ }
END {
    for (i in ip)
        if ( max < ip[i] ) {
            max = ip[i]
            maxnumber = i
        }
    print maxnumber, " has accessed ", max, " times."
}
We can see the output of the command in the following screenshot. Part of the client IP address has been obscured as it is from my public web server:
The functionality of the code comes from within the END block. On entering the END block, we run into a for loop that iterates through each entry in the ip array. The conditional if statement checks whether the count for the current entry is higher than the current maximum, max. If it is, that count becomes the new maximum and its address is stored in maxnumber. When the loop has finished, we print the IP address that has the highest entry.
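The scan for the maximum can be tried on a small scale before it is pointed at the full log. The sketch below is only an illustration under stated assumptions: the five input lines are invented, the addresses are placeholders from the 192.0.2.0/24 documentation range, and the output is shortened to the winning address and its count.

```shell
# Hypothetical sample input; only the first field (the client IP) is used.
top=$(printf '%s\n' \
  '192.0.2.10 GET /' \
  '192.0.2.20 GET /a' \
  '192.0.2.10 GET /b' \
  '192.0.2.30 GET /c' \
  '192.0.2.10 GET /d' |
  awk '
    { ip[$1]++ }                  # count requests per client address
    END {
      for (i in ip)
        if (max < ip[i]) {        # max starts out unset, which compares as 0
          max = ip[i]             # remember the highest count seen so far
          maxnumber = i           # and the address that owns it
        }
      print maxnumber, max        # report only the winning address
    }')

echo "$top"
```

Because the comparison is a strict less-than, two addresses with the same top count are not both reported; the one that awk happens to visit first in the loop is kept.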