CommandLineFu: Split a Text file based on line numbers

Assumptions: I have a text file that contains 2.5+ million lines, and I want to split it into 100,000-line text files.

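#!/bin/bash
# x and y are the first and last line numbers of the current chunk;
# z numbers the output file and stops the loop after 26 files.
x=1
y=100000
z=1
while [ "$z" -le 26 ]
do
    # Print lines x through y, then quit so sed does not read the rest of the file.
    sed -n "${x},${y}p;${y}q" tbl_001.txt > "t$z.txt"
    x=$(( x + 100000 ))
    y=$(( y + 100000 ))
    z=$(( z + 1 ))
done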

If you want to split by a different amount, change the starting value of the “y” variable and the number added to “x” and “y” on each pass (the + 100000) to the number of lines you want per file.
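Since the chunk size appears in three places, it is easy to change one and miss another. Here is a minimal sketch of the same loop with the size held in a single variable. The function name split_by_lines and its three arguments are my own invention for illustration, not part of the original script.

```shell
#!/bin/bash
# split_by_lines FILE LINES COUNT  (hypothetical helper)
# Writes COUNT files named t1.txt, t2.txt, ... holding LINES lines each.
split_by_lines() {
    local file=$1 lines=$2 count=$3
    local x=1 y=$lines z=1
    while [ "$z" -le "$count" ]
    do
        # Same sed call as the original loop, with the input file as a parameter.
        sed -n "${x},${y}p;${y}q" "$file" > "t$z.txt"
        x=$(( x + lines ))
        y=$(( y + lines ))
        z=$(( z + 1 ))
    done
}

# Example call matching the numbers in the original script:
#   split_by_lines tbl_001.txt 100000 26
```

With this shape, switching to 50,000-line files means changing one argument instead of three numbers.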

The “z” variable supplies the number in each output filename and also serves as the loop cutoff.  If my original file had only one million lines, I would change the limit in the “while” condition from 26 to 10.
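The cutoff is just the total line count divided by the lines per file, rounded up so a partial last chunk still gets its own file. A small sketch of that arithmetic, where chunk_count is a hypothetical helper name and not something from the original script:

```shell
# chunk_count TOTAL LINES  (hypothetical helper)
# Prints how many output files are needed, rounding up for a partial last chunk.
chunk_count() {
    local total=$1 lines=$2
    echo $(( (total + lines - 1) / lines ))
}

chunk_count 1000000 100000   # prints 10
chunk_count 2500001 100000   # prints 26
```

Bash arithmetic is integer-only, so adding lines - 1 before dividing is what turns the truncating division into a round-up.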

I’m sure there’s a way to have the machine do the math for me, but I don’t have the patience to hunt it down right now.  I imagine it would involve storing the line count (wc -l) in a variable, prompting the end user for the maximum number of lines per file (read maxcount), and looping until the whole file has been processed.  (Not sure how to do this last part.)  A project for another day.
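For what it's worth, here is one possible sketch of that idea, not a tested production script. It assumes bash and a file whose last line ends with a newline (wc -l counts newline characters). The function name split_all is hypothetical; maxcount is the variable name suggested above. Instead of counting files, the loop simply runs until the starting line number passes the end of the file.

```shell
#!/bin/bash
# split_all FILE MAXCOUNT  (hypothetical helper)
# Splits FILE into t1.txt, t2.txt, ... of at most MAXCOUNT lines each,
# stopping on its own once every line has been written.
split_all() {
    local file=$1 maxcount=$2
    local total x=1 y z=1
    # Refuse a zero or negative chunk size, which would loop forever.
    [ "$maxcount" -gt 0 ] || return 1
    total=$(wc -l < "$file")   # number of lines in the input
    y=$maxcount
    while [ "$x" -le "$total" ]
    do
        sed -n "${x},${y}p;${y}q" "$file" > "t$z.txt"
        x=$(( x + maxcount ))
        y=$(( y + maxcount ))
        z=$(( z + 1 ))
    done
}

# Interactive use, prompting for the chunk size:
#   read -p "Lines per file: " maxcount
#   split_all tbl_001.txt "$maxcount"
```

Each pass still makes sed read from the top of the file, so later chunks take longer than earlier ones. If the t1.txt, t2.txt naming is not important, the standard split utility (split -l 100000 tbl_001.txt) does the same job in a single pass, naming its output xaa, xab, and so on by default.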
