Pretending python is a shell¶
We all like python for scripting, because it’s so much more powerful than a shell. But sometimes we really need to call a shell command because it’s so much easier than writing yet another library in python or adding a dependency:
from whelk import shell
shell.zgrep("-r", "downloads", "/var/log/httpd")
# Here goes code to process the log
You can even pipe commands together:
from whelk import pipe
pipe(pipe.getent("group") | pipe.grep(":1...:"))
Installing¶
Installing the latest released version is as simple as:
pip install whelk
If you want to tinker with the source, you can install the latest source from github:
git clone https://github.com/seveas/whelk.git
Calling a command¶
The whelk.shell
object can be used to call any command on your
$PATH
that is also a valid python identifier. Since many command names
contain a “-”, it will find those even if you spell them with a “_”. So e.g.
run-parts
can be found as shell.run_parts()
.
If your command is not valid as a python identifier, even after replacing
dashes with underscores, you can use the shell
object as a dict. This
dict also accepts full paths to commands, even if they are not on your
$PATH
.
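The lookup rules above can be pictured with a small sketch. This is not whelk's actual implementation: find_command is a hypothetical helper built on the standard library's shutil.which, and it assumes that a name is tried as spelled first and with underscores turned into dashes second, while anything containing a path separator is treated as an explicit path.

```python
import os
import shutil


def find_command(name, path=None):
    """Hypothetical helper: resolve a command name as the text describes."""
    # Anything containing a path separator is treated as an explicit path,
    # so it does not need to be on $PATH.
    if os.sep in name:
        if os.path.isfile(name) and os.access(name, os.X_OK):
            return name
        return None
    # Try the name as spelled, then with underscores replaced by dashes.
    for candidate in (name, name.replace("_", "-")):
        found = shutil.which(candidate, path=path)
        if found:
            return found
    return None
```

With a helper like this, find_command("run_parts") would locate run-parts if it is installed, and find_command("./Configure") would accept a script in the current directory.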
Attributes of the shell
instance are all callables. Arguments to this
callable get mapped to arguments to the command via a subprocess.Popen
object. Keyword arguments get mapped to keyword arguments for the
Popen
object:
result = shell.netstat('-tlpn')
result = shell.git('status', cwd='/home/dennis/code/whelk')
result = shell['2to3']('awesome.py')
result = shell['./Configure']('-des', '-Dusedevel')
Oh, and on windows you can leave out the exe
suffix, like you would on
the command line as well:
result = shell.nmake('test')
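To make the mapping concrete, here is a minimal sketch of how positional arguments could become the command line while keyword arguments are handed to subprocess.Popen. The run function is a hypothetical stand-in written against the standard library only; whelk's own code may well differ.

```python
import os
import subprocess
import sys
import tempfile


def run(command, *args, **popen_kwargs):
    """Hypothetical stand-in for calling an attribute of the shell object."""
    # Capture output unless the caller asked for something else.
    popen_kwargs.setdefault("stdout", subprocess.PIPE)
    popen_kwargs.setdefault("stderr", subprocess.PIPE)
    # Positional arguments become the command line, keyword arguments
    # such as cwd are passed straight to Popen.
    sp = subprocess.Popen([command] + list(args), **popen_kwargs)
    stdout, stderr = sp.communicate()
    return sp.returncode, stdout, stderr


# Run a child python in another directory and ask it where it is.
workdir = tempfile.gettempdir()
returncode, stdout, stderr = run(
    sys.executable, "-c", "import os; print(os.getcwd())",
    cwd=workdir, universal_newlines=True,
)
```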
Shell commands return a namedtuple (returncode, stdout, stderr).
These
result objects can also be used as booleans. As in shellscript, a non-zero
returncode is considered False
and a returncode of zero is considered
True
, so this simply works:
result = shell.make('test')
if not result:
    print("You broke the build!")
    print(result.stderr)
The result of pipe(...)
is slightly different: instead of a single return
code, it actually will give you a list of returncodes of all items in the
pipeline. Result objects like this are only considered True
if all
elements are zero.
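The boolean behaviour described above can be sketched with a namedtuple subclass. Result is a hypothetical name used for illustration only, and the sketch assumes a single integer for a plain command and a list of integers for a pipeline.

```python
from collections import namedtuple


class Result(namedtuple("Result", ["returncode", "stdout", "stderr"])):
    """Hypothetical result type whose truthiness follows the exit status."""
    __slots__ = ()

    def __bool__(self):
        codes = self.returncode
        if isinstance(codes, int):
            codes = [codes]
        # True only when every process exited with status zero.
        return all(code == 0 for code in codes)


ok = Result(0, "all tests passed\n", "")
broken = Result(2, "", "make: *** [test] Error 1\n")
pipeline = Result([0, 1], "", "")  # second process in the pipe failed
```

Because the result is still a tuple, it can be unpacked as returncode, stdout, stderr = ok as well as tested with if not broken.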
Keyword arguments¶
In addition to the subprocess.Popen
arguments, whelk supports a few
more keyword arguments:
input
Contrary to the subprocess defaults, stdin, stdout and stderr are set to whelk.PIPE by default. Input for the command can be passed as the input keyword parameter.
Some examples:
result = shell.cat(input="Hello world!")
result = shell.vipe(input="Some data I want to edit in an editor")
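For comparison, this is roughly what passing input looks like with the standard library alone. It is a sketch of the underlying mechanism, not of whelk's code; the child process here is a small python one-liner chosen so the example does not depend on any particular command being installed.

```python
import subprocess
import sys

# A child process that upper-cases whatever arrives on its stdin.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]

# All three standard streams are pipes, and the input is written to stdin.
sp = subprocess.Popen(child, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                      stderr=subprocess.PIPE, universal_newlines=True)
stdout, stderr = sp.communicate(input="Hello world!")
```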
output_callback
To process output as soon as it arrives, specify a callback to use. Whenever output arrives, this callback is called with the following arguments: the shell instance, the subprocess, the file descriptor the data came in on, the actual data (or None in case of EOF) and any user-specified arguments. Here’s an example that uses this feature for logging:
import logging

def cb(shell, sp, fd, data, extra=""):
    if data is None:
        # EOF: there is no data left to log for this file descriptor
        logging.debug("%s<%d:%d> File descriptor closed" % (extra, sp.pid, fd))
        return
    for line in data.splitlines():
        logging.debug("%s<%d:%d> %s" % (extra, sp.pid, fd, line))

shell.dmesg(output_callback=cb)
shell.mount(output_callback=[cb, "Mountpoints: "])
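The calling convention can be illustrated with a simplified sketch built on the standard library. run_with_callback is a hypothetical helper: it only watches stdout, reads line by line, and leaves out the shell instance argument, whereas a real implementation would have to multiplex stdout and stderr.

```python
import subprocess
import sys


def run_with_callback(argv, callback):
    """Hypothetical helper: hand each line of stdout to callback."""
    sp = subprocess.Popen(argv, stdout=subprocess.PIPE,
                          universal_newlines=True)
    fd = sp.stdout.fileno()
    for line in sp.stdout:
        callback(sp, fd, line)
    # None tells the callback that the file descriptor reached EOF.
    callback(sp, fd, None)
    sp.stdout.close()
    return sp.wait()


received = []
returncode = run_with_callback(
    [sys.executable, "-c", "print('one'); print('two')"],
    lambda sp, fd, data: received.append(data),
)
```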
raise_on_error
This makes your shell even more pythonic: instead of returning an error code, a CommandFailed exception is raised whenever a command exits with a non-zero exit code.
The reason this is not the default is that for quite a few commands a non-zero exit code does not indicate an error at all. For example, the venerable diff command returns 1 if there is a change and 0 if there is none.
exit_callback
If you want slightly more fine-grained control than raise_on_error, you can use this argument to specify a callable to call whenever a process exits, irrespective of the returncode. The callback will be called with as arguments the command instance, the subprocess, the result tuple and any user-provided arguments.
Both raise_on_error and exit_callback are most useful when set as a default of a Shell instance; they are not really needed when calling single commands.
Here’s a real life example of an exit callback, which will retry git operations when they break due to repository locks:
import random
import time
from whelk import Shell

def check_lock(command, sp, res):
    if not res:
        if 'index.lock' in res.stderr:
            # Let's retry
            time.sleep(random.random())
            return command(*command.args, **command.kwargs)
        raise RuntimeError("%s %s failed: %s" % (command.name, ' '.join(command.args), res.stderr))

git = Shell(exit_callback=check_lock).git
git.checkout('master')
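As a second illustration of why an exit callback can be more precise than raising on every non-zero status, the sketch below tolerates exit status 1, as one would want for diff. Everything in it is hypothetical and uses only the standard library: run mimics the callback hook, and exit_with stands in for a real command.

```python
import subprocess
import sys


def run(argv, exit_callback=None):
    """Hypothetical helper: run argv, then hand the outcome to a callback."""
    sp = subprocess.Popen(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, universal_newlines=True)
    stdout, stderr = sp.communicate()
    result = (sp.returncode, stdout, stderr)
    if exit_callback is not None:
        exit_callback(argv, sp, result)
    return result


def tolerate_one(argv, sp, result):
    # Status 1 means "differences found" for diff, so only other
    # non-zero statuses are treated as failures.
    if result[0] not in (0, 1):
        raise RuntimeError("%s failed with status %d" % (argv[0], result[0]))


def exit_with(status):
    # Stand-in for a real command: a child python exiting with status.
    return [sys.executable, "-c", "import sys; sys.exit(%d)" % status]


returncode, _, _ = run(exit_with(1), exit_callback=tolerate_one)
```

Calling run(exit_with(2), exit_callback=tolerate_one) raises RuntimeError instead of returning.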
run_callback
A function that will be called whenever the shell instance is about to create a new process. The callback will be called with as arguments the command instance and any user-provided arguments. Here’s an example that logs all starts of applications:
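import logging
import os
from whelk import Shell

logger = logging.getLogger(__name__)

def runlogger(cmd):
    args = [cmd.name] + list(cmd.args)
    env = cmd.sp_kwargs.get('env', '')
    if env:
        # Only log the variables that differ from the current environment
        env = ['%s=%s' % (x, env[x]) for x in env if env[x] != os.environ.get(x, None)]
        env = '%s ' % ' '.join(env)
    logger.debug("Running %s%s" % (env, ' '.join(args)))

shell = Shell(run_callback=runlogger)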
Piping commands together¶
The whelk.pipe
object is similar to the shell
object but has
a few significant differences:
pipe commands can be chained with | (binary or), resembling a shell pipe. pipe takes care of the I/O redirecting.
The command is not started immediately, but only when wrapping it in another pipe() call (yes, the object itself is callable), or chaining it to the next.
In the result tuple, the returncode is actually a list of returncodes of all the processes in the pipe, in the order they are executed in.
The only I/O redirection you may want to override is stderr=whelk.STDOUT, or stderr=open('/dev/null', 'w'), to redirect stderr of a process to stdin of the next process, or to /dev/null, respectively.
Some examples:
result = pipe(pipe.dmesg() | pipe.grep('Bluetooth'))
cow = random.choice(os.listdir('/usr/share/cowsay/cows'))
result = pipe(pipe.fortune("-s") | pipe.cowsay("-n", "-f", cow))
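For readers curious what the I/O redirecting amounts to, here is the usual standard library pattern for a two-stage pipeline, including a list of return codes in execution order. It is a sketch of the general technique rather than whelk's implementation, and both stages are python one-liners that stand in for dmesg and grep.

```python
import subprocess
import sys

# Stand-ins for "dmesg" and "grep Bluetooth".
producer = [sys.executable, "-c",
            "print('usb 1-1: new device'); print('Bluetooth: hci0 ready')"]
consumer = [sys.executable, "-c",
            "import sys; "
            "sys.stdout.writelines(l for l in sys.stdin if 'Bluetooth' in l)"]

# stdout of the first process becomes stdin of the second.
first = subprocess.Popen(producer, stdout=subprocess.PIPE)
second = subprocess.Popen(consumer, stdin=first.stdout,
                          stdout=subprocess.PIPE, universal_newlines=True)
# Close our copy so the producer notices if the consumer exits early.
first.stdout.close()
stdout, _ = second.communicate()

# One return code per process, in the order they appear in the pipe.
returncodes = [first.wait(), second.returncode]
```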
Setting default arguments¶
If you want to launch many commands with the same parameters, you can set
defaults by passing parameters to the Shell
constructor. These are
passed on to all commands launched by that shell, unless overridden in specific
calls:
import os
from whelk import Shell
my_env = os.environ.copy()
my_env['http_proxy'] = 'http://webproxy.corp:3128'
shell = Shell(stderr=Shell.STDOUT, env=my_env, encoding='utf8')
shell.wget("http://google.com", "-O", "google.html")
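The "defaults unless overridden" rule boils down to merging two sets of keyword arguments. The sketch below shows that merge with a hypothetical Runner class on top of subprocess.run; it is not how whelk itself is written, and the second proxy URL is made up for the example.

```python
import os
import subprocess
import sys


class Runner:
    """Hypothetical stand-in for a shell object with default arguments."""

    def __init__(self, **defaults):
        self.defaults = defaults

    def __call__(self, *argv, **overrides):
        kwargs = dict(self.defaults)
        kwargs.update(overrides)  # per-call arguments win over defaults
        return subprocess.run(list(argv), **kwargs)


my_env = os.environ.copy()
my_env["http_proxy"] = "http://webproxy.corp:3128"
run = Runner(env=my_env, stdout=subprocess.PIPE, universal_newlines=True)

# A child python that reports which proxy it sees.
show_proxy = "import os; print(os.environ.get('http_proxy'))"
with_default = run(sys.executable, "-c", show_proxy)

# Hypothetical second proxy, passed for this one call only.
other_env = dict(my_env, http_proxy="http://localhost:8080")
overridden = run(sys.executable, "-c", show_proxy, env=other_env)
```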
Python compatibility¶
Whelk is compatible with python 3.4 and up; python 2 is no longer supported. If you find an incompatibility, please report a bug at https://github.com/seveas/whelk.