Category Archives: Advanced

Nuix 101: Replace an Encrypted Zip

This video will show you how to replace an encrypted zip file in your Nuix Case.

To see more Nuix how-to videos, visit the YouTube playlist here: https://www.youtube.com/playlist?list=PL63768553A2B1803F

#CommandLineFu: 8 Deadly Commands You Should Never Run on Linux

Being a command-line junkie, I thought I’d share this fun little article I found the other day from @chrisbhoffman – 8 Deadly Commands You Should Never Run on Linux.

rm -rf / – Deletes Everything!

:(){ :|: & };: – Fork Bomb

mkfs.ext4 /dev/sda1 – Formats a Hard Drive

command > /dev/sda – Writes Directly to a Hard Drive

dd if=/dev/random of=/dev/sda – Writes Junk Onto a Hard Drive

mv ~ /dev/null – Moves Your Home Directory to a Black Hole

wget http://example.com/something -O - | sh – Downloads and Runs a Script


CommandLineFu: File/Byte Count of Folder List

while read -r dir; do echo -n $(du -sb "$dir") ; echo "|$(find "$dir" -type f | wc -l)" ; done < UD_input.txt | tee UserData.log

Uses a text file with one directory per line as input and prints:

BYTES [space] DIRECTORY NAME [pipe] FILE COUNT

1842531456 ./FD99_UserShare/z/Bond/|213
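
For reference, UD_input.txt is just a plain list of directories, one per line; something like this (made-up entries):

./FD99_UserShare/z/Bond/
./FD99_UserShare/z/Smith/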

CommandLineFu: FileType Report w/ Dates

More find magic — this little beast takes a while to run on large directories but is worth its weight in gold.  Now I just need a way to convert Epoch time to YYYYMMDD format inline. (New project; a first attempt follows the field list below.)

find ./foo/ -type f -printf '%f|%h|%s|%AY%Am%Ad|%TY%Tm%Td|' -exec stat --printf "%W|" '{}' \; -exec file -bp '{}' \; > bar.log

%f == file name without leading directories

%h == leading directories without file name

%s == size in bytes

%A == Last access time (Y,m,d == YYYYMMDD format)

%T == Modification time (Y,m,d == YYYYMMDD format)

(using printf keeps everything on the same line)

stat %W == file birth date in Epoch time

file -bp == checks file type, b==brief, p==preserve date
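
Update on that Epoch conversion: GNU date will do it inline with date -d "@EPOCH" +%Y%m%d. A minimal post-processing sketch, assuming the seven pipe-delimited fields produced by the command above (bar_dates.log is just a made-up output name):

# Convert the 6th field (birth date in Epoch time) to YYYYMMDD.
# stat prints %W as 0 when the birth time is unknown, so leave those alone.
while IFS='|' read -r name dir size atime mtime birth type; do
    if [ "$birth" -gt 0 ] 2>/dev/null; then
        birth=$(date -d "@$birth" +%Y%m%d)
    fi
    echo "$name|$dir|$size|$atime|$mtime|$birth|$type"
done < bar.log > bar_dates.log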

CommandLineFu: FOR LOOP – report file count in pwd/* && print disk usage

#!/bin/bash
# For each directory in the working directory: append "name:count" to the
# report file, then print the directory's disk usage to the screen.
for i in */ ;
    do
        echo -n "$i:" >> "/path/to/some/file/already/created.txt" ;
        find "$i" -type f | wc -l >> "/path/to/some/file/already/created.txt" ;
        du -hs "$i" ;
    done
exit 0

Quick and dirty…
Actually this is quite slow when dealing with directories containing thousands of little files.
But it gets the job done.  I’ll play around with it and see if forking helps; a quick sketch of that idea is below.
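
Here’s roughly what I have in mind for the forked version (an untested sketch; the counts can finish out of order, and concurrent appends to one file may interleave):

#!/bin/bash
# Count each directory in a background job instead of serially.
for i in */ ;
    do
        {
            count=$(find "$i" -type f | wc -l)
            echo "$i:$count" >> "/path/to/some/file/already/created.txt"
            du -hs "$i"
        } &
    done
wait    # don't exit until every background job finishes
exit 0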

CommandLineFu: Read list of filenames – test if they exist

while read -r file; do if [[ ! -e $file ]]; then echo "$file|error" ; fi ; done < input.txt

I’ll usually pipe the output to tee so I can watch what’s going on:

while read -r file; do if [[ ! -e $file ]]; then echo "$file|error" ; fi ; done < input.txt | tee error_report.txt

CommandLineFu: Split a Text file based on line numbers

Assumptions: I have a text file that contains 2.5+ million lines, and I want to split it into 100,000-line text files.

#!/bin/bash
# x/y == first/last line of the current chunk; z == chunk number.
x=1
y=100000
z=1
while [ $z -le 26 ]
do
    # Print lines x through y, then quit at line y so sed doesn't
    # keep reading the rest of the file.
    sed -n "$x,${y}p;${y}q;" tbl_001.txt > "t$z.txt"
    x=$(( $x + 100000 ))
    y=$(( $y + 100000 ))
    z=$(( $z + 1 ))
done

If you want to split by a different amount, change the initial value of the “y” variable and both of the “+ 100000” increments to the number of lines you want per file.

The “z” variable is used as the filename, and the cutoff point.  If my original file only had one million lines I would change the “while” condition to 10 instead of 26.

I’m sure there’s a way to have the machine do the math for me, but I don’t have the patience to hunt that down right now.  I imagine it would involve storing the line count (wc -l) in a variable, prompting the end user for the max line count (read maxcount), and looping until the file is completely done.  A project for another day; a rough sketch of the idea is below.
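
Sketching that out (untested; it just derives the loop cutoff from wc -l instead of hard-coding 26):

#!/bin/bash
# Total line count of the input file.
lines=$(wc -l < tbl_001.txt)
# Ask the end user how many lines per output file.
read -p "Lines per file: " maxcount
x=1
y=$maxcount
z=1
# Keep cutting chunks until x passes the end of the file.
while [ $x -le $lines ]
do
    sed -n "$x,${y}p;${y}q;" tbl_001.txt > "t$z.txt"
    x=$(( $x + $maxcount ))
    y=$(( $y + $maxcount ))
    z=$(( $z + 1 ))
done

For what it’s worth, coreutils can also do this in one shot: split -l 100000 -d tbl_001.txt t (GNU split; -d gives numeric suffixes).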

CommandLineFu: Basic Filetype Report

A little more find magic:

find . -type f -printf '%d|%k|%f|' -exec file -F"|" -pi '{}' \;

Explanation:

find .

Start searching within the working directory.

-type f

Return only files, not directories

-printf '%d|%k|%f|'

Output tree depth, then a pipe, then disk usage in 1 KB blocks (%k is kilobyte blocks, not bytes; use %s for an exact byte count), then a pipe, then the filename without the leading directories, and a trailing pipe.

-exec file -F"|" -pi

Execute the “file” command on each file found, using a pipe delimiter (-F"|") instead of the default “:” colon; additionally, do not change last access times (-p), and output the file type in MIME format (-i).

for dir in */; do a=${dir%%,*} ; find "$dir" -type f -printf '%d|%k|%f|' -exec file -F'|' -pi '{}' \; > "${a}_LIST.TXT" & done ; wait

I use the above FOR loop when I have many folders that I need to generate a report for.

In this case all of my top level folders are named after a person (“Last, First”), so I create a variable that cuts off everything from the first comma on. I use that variable (the last name) to name the output text file, and the results of the find command are redirected into it. (A quick example of that parameter expansion is below.)

The only obvious pitfall is if two folders have the same value before the comma, i.e. two different people with the same last name. In that case both background jobs write to the same output file.
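
For anyone unfamiliar with the ${dir%%,*} expansion, it strips the longest match of ,* from the end of the value, i.e. everything from the first comma on:

# Hypothetical folder name:
dir="Bond, James/"
a=${dir%%,*}
echo "$a"    # prints: Bond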

CommandLineFu: Create Allegro Export Report

for i in {001..999} ; do \
a=$(find "./$i/Native/" -type f 2>/dev/null | wc -l) \
&& b=$(find "./$i/Text/" -type f 2>/dev/null | wc -l) \
&& c=$(( $(wc -l "./$i/Data/loadfile.txt" 2>/dev/null | awk '{ print $1 }') - 1 )) \
&& echo "$i,$a,$b,$c" ; done

Assumes all Allegro exports are located in the same top level directory, and follow a simple alpha-numeric sequence.

In the future I plan on building a simple script around this that prompts for a top-level directory so it can be run from anywhere on the network.  Additionally, it would store the report to a text file and run subsequent calculations to flag any line item that doesn’t match perfectly between Native, Text, and record count.  A rough sketch of that plan is below.

Allegro Export FOR loop
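
Here’s the general shape of that future script (an untested sketch; the paths and report name are assumptions, and a volume with a missing loadfile will show a record count of -1):

#!/bin/bash
# Prompt for the top-level export directory so this runs from anywhere.
read -p "Top-level export directory: " top
cd "$top" || exit 1
for i in {001..999} ; do
    [ -d "./$i" ] || continue     # skip volume numbers that don't exist
    a=$(find "./$i/Native/" -type f 2>/dev/null | wc -l)
    b=$(find "./$i/Text/" -type f 2>/dev/null | wc -l)
    c=$(( $(cat "./$i/Data/loadfile.txt" 2>/dev/null | wc -l) - 1 ))
    flag=""
    [ "$a" -eq "$b" ] && [ "$b" -eq "$c" ] || flag=",MISMATCH"
    echo "$i,$a,$b,$c$flag"
done | tee allegro_report.csv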

CommandLineFu: AWK – comma quote delimited file

awk -F'^"|","|"$' '{ OFS="|" ; print $2,$3,$4 }' somefile.txt

Use awk to parse a quoted, comma-delimited file.

OFS="|" makes the output pipe-delimited.

Add one to all column references, $1 = $2, $2 = $3, etc.
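
A quick worked example (made-up input line):

echo '"Smith","John","555-1212"' | awk -F'^"|","|"$' '{ OFS="|" ; print $2,$3,$4 }'
# prints: Smith|John|555-1212
# $1 is the empty field before the leading quote, which is why
# every column reference shifts up by one.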

CommandLineFu: Bash Netcat Copy

Have you ever had to copy millions upon millions of little files across your network very, very quickly? Have you exhausted all of your other command line hacks yet? Of course you have, or you wouldn’t be reading this. (Or you’re my mom.)

OK… I get that the audience for this type of thing is rather limited. But this is one of those posts that will get more hits from me than anybody else. This is strictly for demonstrating how to send thousands or millions of little files across a network using bash, tar, and netcat. (Mom, you can stop reading now… I don’t make a cameo in the video… you can skip this one. Thanks for the click though.)

The Code: One-liners are a beautiful thing

##Talking Box

tar -czf - [source_dir] | nc [destination_ip] [destination_port]

##Listening Box

nc -l -p [local port] | tar -C [destination_dir] -xzf -

The setup:
Running Cygwin/X from one of my XP boxes, I tunneled into two different Linux boxes (krispc7 == Ubuntu 11.04 && kris@bt == BackTrack 5).

ssh -X kris@krispc7
ssh -X kris@bt

I did this to work entirely within a native Linux environment, mainly because I’ve only ever done this with cygwin in the past, and also so I could demonstrate everything on the same screen using terminator (my favorite GUI shell) without running multiple desktop recorders. I don’t actually need the X forwarding, and I’m sure performance suffered because of it. Additionally, the files being copied were on a separate Windows file server (we’ll call him e5), so that throws the whole speed thing out the window. Combine that with my extremely verbose switches and you could probably print the files out of one machine and physically scan them into the other quicker than the actual copy took place.

Like I said… for demonstration purposes only. Running the compression and netcat instance on a third party machine is just plain stupid in this situation if you’re trying to move stuff really fast (not to mention that this particular hack box has no legs at all). The ideal environment would be to run the talkie box command on the actual talking box.

Moving on…

I ssh into bt (I know everyone roots into their bt boxes… but I don’t allow root to ssh into anything), go to the network-shared directory on e5 that contains the subfolders with the millions of little files, and initiate the talkie side of the command. I then ssh into krispc7 and initiate the listening side of the command.

…Actually it’s the other way around… but you get the idea (“YOU”, is me talking to myself in my own post. Now I’m omnisciently referring to myself in the third person twice removed… and you thought you had problems.)

So listening box is listening, and talking box is waiting for me to hit enter. In the bottom right of the video is a simple while loop I used to count the number of new files in the destination directory (something like the loop below).
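
I didn’t save the exact counter, but it was along these lines (the path and interval here are made up):

# Print a running file count for the destination every few seconds.
while true; do
    find /path/to/destination_dir -type f | wc -l
    sleep 5
done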

I let the copy/nc job run for about ten minutes before I killed the video. But I cut a lot out while editing… so 10 minutes happens in less than two. (Who really wants to watch a video of files being copied?).

What’s happening here you ask?
Each file is being compressed on (what is supposed to be) the local machine and instead of being output to an individual zip or tarball file I’m simply redirecting the compressed data into netcat which sends the information over a tcp connection pointed at a specific port. The listening box in turn is monitoring the port defined (9998 in my video) for any and all incoming data and redirects it to be decompressed in the output location of choice.
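
Filled in with the port from the video (the IP and paths are made up), the pair looks like this. Start the listener first, then kick off the talker:

## Listening Box (start first)
nc -l -p 9998 | tar -C /data/incoming -xzf -

## Talking Box
tar -czf - ./millions_of_little_files | nc 192.168.1.50 9998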

Maybe tomorrow I’ll run a test that involves copying a bunch of stuff back and forth between two high-end machines (without any man in the middle), and compare the speeds when using different types of compression. Then compare those to a standard scp, windows drag and drop file copy, and my favorite… xxcopy.

Until then, enjoy the show. (Always launch the videos in full screen to watch in HD).