Subject: grep awk strings uniq
Exercise: create files to practice using grep
When this material is presented in training, the students create several text files based
on the man pages. The following script provides the material to exercise the commands on.
Open a file using vi, cut and paste or retype this information into it, and save it:
#!/bin/bash
##########################################################
## simple script with for loop
###########################################################
TOOL="dmesg bootloader ldd ldconfig dpkg dpkg-reconfigure apt-get
apt-cache aptitude rpm rpm2cpio yum yumdownloader bash echo env
exec export pwd set unset man uname history cat cut expand fmt
head od join nl paste pr sed sort split tail tr unexpand uniq
wc cp find mkdir mv ls rm rmdir touch tar cpio dd file gzip
gunzip bzip2 tee xargs bg fg jobs kill nohup ps top free uptime
killall nice ps renice top grep egrep fgrep sed regex vi fdisk
mkfs mkswap du df fsck e2fsck mke2fs debugfs dumpe2fs tune2fs
mount umount quota edquota repquota quotaon chmod umask chown
chgrp find locate updatedb whereis which type" ; export TOOL
# render each man page as plain text; col -b strips the backspace overstrikes
for x in $TOOL
do
man "$x" | col -b > "man.$x.txt"
done
ls -al
############################################## end of script ####
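The TOOL list repeats a few names (ps, top, sed, find) and contains entries that may have no man page on a given system. The following is a minimal sketch, not part of the original script, of one way to skip both cases. TOOL_SAMPLE and unique_tools are hypothetical names used only for illustration, and the short list stands in for the full one.

```shell
#!/bin/bash
# Sketch: build the practice files while skipping duplicate names and
# names that have no manual page. TOOL_SAMPLE is a shortened,
# hypothetical stand-in for the full TOOL list.
TOOL_SAMPLE="grep sed ps top ps top find sed find"

# Print each name once, one per line, in sorted order.
# $1 is left unquoted on purpose so the shell splits it into words.
unique_tools() {
    printf '%s\n' $1 | sort -u
}

for x in $(unique_tools "$TOOL_SAMPLE")
do
    # Only write a file when man can render the page; otherwise report it.
    if man "$x" > /dev/null 2>&1
    then
        man "$x" | col -b > "man.$x.txt"
    else
        echo "no man page for $x, skipped" >&2
    fi
done
```

Rendering each page twice is slower than the original loop, but it avoids leaving empty man.*.txt files behind for names that man does not know.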
Once you've created and executed the script above, you can experiment with the following grep, awk and uniq commands.
grep -r "the" *.txt | awk -F : '{print $1}' | uniq
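This greps for the string "the" (as a substring; add -w to match only the whole word) in every *.txt file in the current directory. Because the shell expands *.txt before grep runs, -r only descends into sub-directories whose names also match *.txt; use grep -r "the" . to search the whole tree.
In the run shown here there were 60 matching lines. "awk" keeps only the file name (the field before the first colon), and "uniq" then collapses the repeats so that each file is listed once.
That leaves 8 files that met the test, out of roughly 72,654 entries in the directory tree... response was pretty quick.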
--> grep -r the *.txt | awk -F : '{print $1}' | wc -l
60
--> grep -r the *.txt | awk -F : '{print $1}' | uniq | wc -l
8
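The two counts above can be reproduced on a small scale. This is a sketch under made-up assumptions: the temporary directory and the three sample files are invented for the demonstration, and the last command shows grep -l as an alternative that lists each matching file once without awk or uniq.

```shell
#!/bin/bash
# Build a small throwaway directory so the counts are easy to check by hand.
demo=$(mktemp -d)
printf 'the cat\nthe dog\nno match\n' > "$demo/a.txt"
printf 'nothing here\n' > "$demo/b.txt"
printf 'see the end\n' > "$demo/c.txt"
cd "$demo" || exit 1

# One output line per matching line, reduced to the file name
# (the field before the first colon).
matches=$(grep -r "the" *.txt | awk -F : '{print $1}' | wc -l)

# uniq collapses the adjacent repeats, leaving one line per file.
files=$(grep -r "the" *.txt | awk -F : '{print $1}' | uniq | wc -l)

# grep -l prints each matching file name once, with no awk or uniq needed.
files_l=$(grep -l "the" *.txt | wc -l)

echo "matching lines: $matches, files: $files, files via -l: $files_l"
```

With these sample files there are 3 matching lines spread over 2 files. uniq is enough here because grep reports all the matches from one file together; if the names could arrive out of order, sort -u would be the safer choice.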
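The directory tree produced 72,654 lines of "ls -alR" output (the count includes directory headers, "total" lines and the "." and ".." entries, so the number of files is somewhat lower):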
-->ls -alR | wc -l
72654
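Since ls -alR prints more than just file names, a count based on find may be closer to the real number of files. A minimal sketch, using an invented throwaway tree with three files:

```shell
#!/bin/bash
# Throwaway tree: two files at the top level and one in a sub-directory.
tree=$(mktemp -d)
mkdir "$tree/sub"
touch "$tree/one.txt" "$tree/two.txt" "$tree/sub/three.txt"

# ls -alR also prints directory headers, "total" lines, ".", ".."
# and blank separator lines, so this counts more than the files.
ls_lines=$(ls -alR "$tree" | wc -l)

# find -type f prints one line per regular file and nothing else.
file_count=$(find "$tree" -type f | wc -l)

echo "ls -alR lines: $ls_lines, regular files: $file_count"
```

The find count here is 3, while the ls -alR line count is noticeably higher for the same tree.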
If you have binary files, such as tar, zip or exe files or commands, and you want to see whether they contain particular ASCII strings, such
as a command passed through the program, use "strings". This is helpful if you suspect a Microsoft virus in an executable: combine "strings" with grep
to search for things like calls dealing with the Microsoft Registry, such as "RegOpenKey".
Here's an actual example from a file infected with the NETSKY virus, using "strings" to see the commands.
This is a partial extract of the virus made with the "strings" command in Linux.
Do NOT attempt this on a Microsoft operating system...
Notice the *.dll files called out... bad news...
NETSKY virus - see: http://www.sarc.com
========================
strings your_archive.pif | tee -a VIRUS-your_archive.pif-EXTRACT.txt
NOTE: (edited out gibberish using vi)
========================
output of binary file from scripts --> Rich
output of binary file from scripts --> Compressed by Petite (c)1999 Ian Luck.
output of binary file from scripts --> .petite
output of binary file from scripts --> ERROR!
output of binary file from scripts --> Corrupt Data!
output of binary file from scripts --> XXXZt
output of binary file from scripts --> MessageBoxA
output of binary file from scripts --> wsprintfA
output of binary file from scripts --> ExitProcess
output of binary file from scripts --> LoadLibraryA
output of binary file from scripts --> GetProcAddress
output of binary file from scripts --> VirtualProtect
output of binary file from scripts --> InternetGetConnectedState
output of binary file from scripts --> GetNetworkParams
output of binary file from scripts --> RegOpenKeyA
output of binary file from scripts --> USER32.dll
output of binary file from scripts --> KERNEL32.dll
output of binary file from scripts --> WININET.dll
output of binary file from scripts --> WS2_32.dll
output of binary file from scripts --> iphlpapi.dll
output of binary file from scripts --> ADVAPI32.dll
========================
ORIGINAL EMAIL IN PINE:
========================
From: a very sorry Microsoft user
Parts/Attachments:
1 Shown 1 lines Text (charset: Unknown)
2 17 KB Application
----------------------------------------
[ The following text is in the "Windows-1252" character set. ]
[ Your display is set for the "iso-8859-1" character set. ]
[ Some characters may be displayed incorrectly. ]
Your file is attached.
[ Part 2, Application/OCTET-STREAM (Name: "your_archive.pif") 23KB. ]
[ Cannot display this part. Press "V" then "S" to save in a file. ]
In this example one could search for "dll":
strings your_archive.pif | grep dll
output of binary file from scripts --> USER32.dll
output of binary file from scripts --> KERNEL32.dll
output of binary file from scripts --> WININET.dll
output of binary file from scripts --> WS2_32.dll
output of binary file from scripts --> iphlpapi.dll
output of binary file from scripts --> ADVAPI32.dll
strings your_archive.pif | grep Reg
output of binary file from scripts --> RegOpenKeyA
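The same strings-plus-grep pattern can be tried safely on a harmless file. The sketch below builds a tiny fake "binary" whose contents are made up for the demonstration; extract_strings is a hypothetical helper that uses strings when it is installed and otherwise falls back to a rough tr/grep approximation.

```shell
#!/bin/bash
# Build a tiny fake "binary": readable names separated by NUL and
# control bytes. The contents are invented for this demonstration.
sample=$(mktemp)
printf '\001\002RegOpenKeyA\0\003KERNEL32.dll\0\004\005USER32.dll\0ab\0' > "$sample"

# Print runs of 4 or more printable characters, one per line.
# strings -a scans the whole file; the tr/grep branch is only a
# rough stand-in for systems without strings.
extract_strings() {
    if command -v strings > /dev/null 2>&1
    then
        strings -a "$1"
    else
        LC_ALL=C tr -c '[:print:]' '\n' < "$1" | grep -E '.{4,}'
    fi
}

# Library names: lines ending in .dll, ignoring case.
dlls=$(extract_strings "$sample" | grep -i '\.dll$')

# Registry calls: anything containing "Reg".
regs=$(extract_strings "$sample" | grep 'Reg')

echo "$dlls"
echo "$regs"
```

Anchoring the pattern with '\.dll$' keeps the match to names that end in .dll, which is a little stricter than the plain grep dll used above. The two-character run "ab" is dropped because it is shorter than the 4-character minimum.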
-- Linux commands, scripts, tools and systems administration --