Saturday, October 1, 2016

min_fgrep: minimal fgrep command in D

By Vasudev Ram

min_fgrep pattern < file

The Unix fgrep command finds fixed-string patterns (i.e. not regular expression patterns) in its input and prints the lines containing them. It is part of the grep family of Unix commands, which also includes grep itself, and egrep.

All three of grep, fgrep and egrep originally had slightly different behaviors (with respect to number and types of patterns supported), although, according to the Open Group link in the previous paragraph, fgrep is marked as LEGACY. I'm guessing that means the functionality of fgrep is now included in grep, via some command-line option, and maybe the same for egrep too.

Here is a minimal fgrep-like program written in D. It does not support reading from a filename given as a command-line argument, in this initial version; it only supports reading from stdin (standard input), which means you have to either redirect its standard input to come from a file, or pipe the output of another command to it.
/*
File: min_fgrep.d
Purpose: A minimal fgrep-like command in D.
Author: Vasudev Ram
Copyright 2016 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: http://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
*/

import std.stdio;
import std.algorithm.searching;

void usage(string[] args) {
    stderr.writeln("Usage: ", args[0], " pattern");
    stderr.writeln("Prints lines from standard input that contain pattern.");
}

int main(string[] args) {
    if (args.length != 2) {
        stderr.writeln("No pattern given.");
        usage(args);
        return 1;
    }

    string pattern = args[1];
    try {
        foreach(line; stdin.byLine)
        {
            if (canFind(line, pattern)) {
                writeln(line);
            }
        }
    } catch (Exception e) {
        stderr.writeln("Caught an Exception. ", e.msg);
    }
    return 0;
}
Compile with:
$ dmd min_fgrep.d
I ran it with its standard input redirected to come from its own source file:
$ min_fgrep < min_fgrep.d string
and got the output:
void usage(string[] args) {
int main(string[] args) {
    string pattern = args[1];
(Bold style added by me in the post, not part of the min_fgrep output, unlike with some greps.)

Running it as part of a pipeline:

$ type min_fgrep.d | min_fgrep string > c
gave the same output. The interesting thing here is not the min_fgrep program itself - that is very simple, as can be seen from the code. What is interesting is that the canFind function is not specific to type string; instead it is from the std.algorithm.searching module of the D standard library (Phobos), and it is generalized - i.e. it works on D ranges, which are a rather cool and powerful feature of D. This is done using D's template metaprogramming / generic programming features. And since the version of the function specialized to the string type is generated at compile time, there is no run-time overhead due to genericity (I need to verify the exact details, but I think that is what happens).

- Vasudev Ram - Online Python training and consulting

Get updates on my software products / ebooks / courses.

Jump to posts: Python   DLang   xtopdf

Subscribe to my blog by email

My ActiveState recipes

FlyWheel - Managed WordPress Hosting


No comments: