Extracting archives without making a mess
Problem
It is common practice to place all files related to a single project into a single directory, and then to make an archive of that directory.
This practice ensures that the archive does not make a mess of someone’s working directory when extracted.
Unfortunately, such benevolence cannot always be expected of the archive creator, so care must be taken by the user to prevent a mess.
Approach
Naturally, I automated this process by writing a script that always extracts an archive neatly into its own subdirectory. The script served as an example for my For each File project for a few years. However, I converted it to Ruby yesterday because I could not find a way to transform a relative path into an absolute one in GNU Bash.
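The absolute path matters because the script changes into a temporary directory before it invokes the extractor, at which point a relative archive path would no longer resolve. Here is a minimal sketch of the conversion in Ruby, using File.expand_path from the standard library. The path shown is a hypothetical example, not one taken from the script.

```ruby
relative = 'downloads/../archive.tar' # hypothetical relative path

# With one argument, the path is resolved against the current directory.
absolute = File.expand_path(relative)

# With a second argument, the path is resolved against that base instead.
rebased = File.expand_path(relative, '/tmp')

puts absolute
puts rebased
```

Both results have the ".." component collapsed, so they stay valid after a change of directory.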
Solution
Now, behold the sparkling powers of the extract-archive script, listed below and available on GitHub, and enjoy a tidy working directory. :-)
Before the tool
Consider the two archives illustrated below.
$ tree
.
|-- good_archive.tar
`-- evil_archive.tar
0 directories, 2 files
$ tar tf evil_archive.tar
bar
baz
foo
$ tar tf good_archive.tar
good_archive/
good_archive/bar
good_archive/baz
good_archive/foo
Watch what happens to our working directory as we extract them.
$ tar xf evil_archive.tar
$ tree
.
|-- bar
|-- baz
|-- good_archive.tar
|-- foo
`-- evil_archive.tar
0 directories, 5 files
$ tar xf good_archive.tar
$ tree
.
|-- bar
|-- baz
|-- good_archive
|   |-- bar
|   |-- baz
|   `-- foo
|-- good_archive.tar
|-- foo
`-- evil_archive.tar
1 directory, 8 files
The evil archive carelessly littered its contents within our working directory, whereas the good archive politely placed its contents into its own subdirectory.
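The difference between the two archives is visible in their listings before anything is extracted. As a rough sketch only (the script below does not work this way; it extracts into a temporary directory and inspects the result), a hypothetical helper named tidy_listing? could classify the output of `tar tf` by counting distinct top-level entries:

```ruby
# Returns true when every path in a listing (such as the lines printed by
# `tar tf`) falls under a single top-level entry.
# tidy_listing? is a hypothetical helper, not part of the script below.
def tidy_listing?(paths)
  tops = paths.map { |path| path.sub(%r{\A\./}, '').split('/').first }
  tops.compact.uniq.length == 1
end

evil = %w[bar baz foo]
good = %w[good_archive/ good_archive/bar good_archive/baz good_archive/foo]

puts tidy_listing?(evil) # prints false
puts tidy_listing?(good) # prints true
```

Listing first is cheaper than extracting, but it only helps for formats whose tool can print a listing, which is one reason to prefer the extract-then-inspect approach for a script that handles many formats.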
After the tool
$ tree
.
|-- good_archive.tar
`-- evil_archive.tar
0 directories, 2 files
$ extract.rb *.tar
extract.rb: `good_archive.tar' -> `./good_archive'
extract.rb: `evil_archive.tar' -> `./evil_archive.tar+1156637503'
$ tree
.
|-- good_archive
|   |-- bar
|   |-- baz
|   `-- foo
|-- good_archive.tar
|-- evil_archive.tar
`-- evil_archive.tar+1156637503
    |-- bar
    |-- baz
    `-- foo
2 directories, 8 files
~/bin/extract-archive
#!/usr/bin/env ruby
#
# Extracts various compressed and uncompressed file archives (see
# http://en.wikipedia.org/wiki/List_of_archive_formats) into their
# *own* output directories so that they do not make a mess by
# extracting directly into your working directory.
#
# Written in 2003 by Suraj N. Kurapati <https://github.com/sunaku>
require 'tmpdir'
require 'fileutils'
include FileUtils::Verbose
# Extracts the given source archive relative to the given destination path,
# and returns the path of the directory containing the extracted contents.
def extract src_path, dst_path = File.dirname(src_path)
  src_path = File.expand_path(src_path)
  src_name = File.basename(src_path)
  src_suffix = File.extname(src_name)
  src_prefix = File.basename(src_name, src_suffix)

  Dir.mktmpdir(nil, dst_path) do |tmp_dir|
    # decompress the archive
    cd tmp_dir do
      case src_name.sub(/\.part$/, '')
      when /\.(tar\.gz|tar\.Z|tgz|taz)$/i
        system 'tar', '-zxf', src_path
      when /\.(tar\.bz|tar\.bz2|tbz|tbz2)$/i
        system 'tar', '-jxf', src_path
      when /\.(tar\.xz|txz)$/i
        system 'tar', '-Jxf', src_path
      when /\.(tar|cpio|gem)$/i
        system 'tar', '-xf', src_path
      when /\.(tar\.lzo|tzo)$/i
        system "lzop -xc #{src_path.inspect} | tar -xf -"
      when /\.(lzo)$/i
        system 'lzop', '-x', src_path
      when /\.(gz)$/i
        system "gunzip -c #{src_path.inspect} > #{src_prefix.inspect}"
      when /\.(bz|bz2)$/i
        system "bunzip2 -c #{src_path.inspect} > #{src_prefix.inspect}"
      when /\.(shar)$/i
        system 'sh', src_path
      when /\.(7z)$/i
        system '7zr', 'x', src_path
      when /\.(zip)$/i
        system 'unzip', src_path
      when /\.(jar)$/i
        system 'jar', 'xf', src_path
      when /\.(rz)$/i
        ln src_path, src_name # rzip removes the archive after extraction
        system 'rzip', '-d', src_name
      when /\.(rar)$/i
        system 'unrar', 'x', src_path
      when /\.(ace)$/i
        system 'unace', 'x', src_path
      when /\.(arj)$/i
        system 'arj', 'x', src_path
      when /\.(arc)$/i
        system 'arc', 'x', src_path
      when /\.(lhz|lha)$/i
        system 'lha', 'x', src_path
      when /\.(a|ar)$/i
        system 'ar', '-x', src_path
      when /\.(Z)$/
        system "uncompress -c #{src_path.inspect} > #{src_prefix.inspect}"
      when /\.(z)$/
        system "pcat #{src_path.inspect} > #{src_prefix.inspect}"
      when /\.(zoo)$/i
        system 'zoo', 'x//', src_path
      when /\.(cab)$/i
        system 'cabextract', src_path
      when /\.(deb)$/i
        system 'ar', 'x', src_path
      when /\.(rpm)$/i
        system "rpm2cpio #{src_path.inspect} | cpio -i --make-directories"
      else
        warn "I do not know how to extract #{src_path.inspect}"
      end
    end

    # clean any mess made by decompression
    manifest = Dir.new(tmp_dir).entries - %w[ . .. ]

    if manifest.length == 1 # there was no mess!
      adj_dst = File.join(dst_path, manifest.first)
      adj_src = File.join(tmp_dir, manifest.first)
    else
      adj_src = tmp_dir
      adj_dst = File.join(dst_path, src_name[/.*(?=\..*?)/])
    end

    # append a timestamp until the destination is unused and the move succeeds
    adj_dst << "+#{Time.now.to_i}" until
      not File.exist? adj_dst and
      mv(adj_src, adj_dst, :force => true)

    touch tmp_dir # give Dir.mktmpdir() something to remove

    adj_dst
  end
end
if $0 == __FILE__
  prefix = File.basename(__FILE__)

  ARGV.each do |src|
    dst = extract(src)
    puts "#{prefix}: '#{src}' => '#{dst}'"
  end
end
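
The "+1156637503" suffix seen in the earlier demonstration comes from the loop that appends the current Unix timestamp to the destination name while that name is already taken. The following is a simplified sketch of that idea in isolation. The unique_destination helper and its "now" parameter are hypothetical names introduced here so that the behavior can be tried without moving any files.

```ruby
require 'tmpdir'

# Appends "+<unix timestamp>" to the path for as long as the candidate
# already exists, then returns the first unused candidate.
# unique_destination is a hypothetical helper, not part of the script above.
def unique_destination(path, now = Time.now)
  candidate = path.dup
  candidate << "+#{now.to_i}" while File.exist?(candidate)
  candidate
end

Dir.mktmpdir do |dir|
  taken = File.join(dir, 'evil_archive.tar')
  File.write(taken, '') # simulate a name that is already in use

  puts unique_destination(taken, Time.at(1156637503))
  puts unique_destination(File.join(dir, 'good_archive'))
end
```

The first call prints the taken path with "+1156637503" appended, and the second prints its argument unchanged because nothing exists at that path.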