How can a shell script be used to divide a large file into segments based on their length in bytes?

asked 2021-09-26 11:00:00 +0000

pufferfish

1 Answer

answered 2021-12-29 20:00:00 +0000

david

One way to divide a large file into fixed-size segments, measured in bytes, with a shell script:

  1. Determine the total size of the file using the stat command (the -c option is GNU-specific; BSD and macOS use stat -f "%z" instead):

    filesize=$(stat -c "%s" largefile.txt)

  2. Define a segment size in bytes:

    segsize=1000000  # e.g. 1 MB (10^6 bytes)

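  3. Calculate the number of segments, rounding up so that a final partial segment is counted:

    numsegs=$(( (filesize + segsize - 1) / segsize ))

  4. Use a loop to write each segment to its own file:

    for ((i = 0; i < numsegs; i++))
    do
        # Name each segment after the input file plus a zero-padded index
        segment=$(printf "%s_%05d" "largefile.txt" "$i")
        # bs is the segment size, so skip=$i skips i whole segments
        # and count=1 copies exactly one segment
        dd if=largefile.txt of="$segment" bs="$segsize" skip="$i" count=1
    done

In this script (which needs bash for the for ((...)) loop), the loop runs once per segment. On each pass, dd uses a block size equal to the segment size, skips i blocks from the start of the input, and copies one block into a new file. Because the block size equals the segment size, no explicit start and end offsets are needed, and each segment is copied as one large block rather than one byte at a time (bs=1 also works, but it is very slow on large files). The output file name is built with printf from the original file name and a zero-padded segment index. When the file size is not a multiple of the segment size, the last segment is simply shorter. Rounding the segment count up in step 3 avoids both a missing final segment and an empty extra file when the size is an exact multiple. Note that dd buffers one whole block in memory, so this form suits segment sizes that comfortably fit in RAM.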
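A quick way to check the result is to concatenate the segments in order and compare them with the original file. The sketch below is an alternative to the dd loop, not part of the answer above: it assumes the standard split utility with its POSIX -b (bytes per piece) and -a (suffix length) options, and the helper names split_bytes and verify_segments are made up for illustration.

```shell
#!/bin/sh
# split_bytes FILE SEGSIZE PREFIX   (hypothetical helper name)
# Writes PREFIXaaaaa, PREFIXaaaab, ... each holding at most SEGSIZE bytes.
split_bytes() {
    split -b "$2" -a 5 "$1" "$3"
}

# verify_segments FILE PREFIX   (hypothetical helper name)
# Succeeds only if the segments, concatenated in glob order, are
# byte-for-byte identical to FILE. Fixed-width suffixes (aaaaa, aaaab, ...
# or 00000, 00001, ...) sort in the order they were written.
verify_segments() {
    cat "$2"* | cmp -s - "$1"
}
```

For example, split_bytes largefile.txt 1000000 part_ followed by verify_segments largefile.txt part_ returns success when nothing was lost. The same check works on the files produced by the dd loop with the prefix largefile.txt_, provided no other files in the directory match that prefix.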
