This content originally appeared on DEV Community and was authored by Yawar Amin
It is well known by now that the Cmdliner package is the way to go for powerful, pure command-line argument parsing in OCaml, unless you are already in the Jane Street ecosystem.
But what about the OCaml standard library's own command-line parsing module, Arg? The conventional wisdom is that it uses mutation, so it's not pure functional and hence possibly unsafe. But this is not necessarily the case. It is fairly easy to wrap up the usage of Arg in such a way that it becomes purely functional and perfectly safe.
If you have relatively simple command-line parsing needs, then it may be quicker and simpler to just use the built-in module rather than pulling in yet another dependency. Let's look at an example.
The example CLI
Say we have the following CLI:
$ ./cmdarg -help
cmdarg [-w <time>] [-r <repeat>] <msg>
Prints <msg> out to standard output.
-w Time in seconds to wait before printing the message [default 0]
-r How many times to print the message [default 1]
-help Display this list of options
--help Display this list of options
We will implement this in two steps:
- Define a record type to represent the fully-parsed values from the command line
- Define a parse function to actually parse the command line and return a record filled with the correct values
Here's the annotated code:
(* cmdarg.ml *)
module Cmd = struct
  type t = { wait : int; repeat : int; msg : string }
Note, all three fields are required. We have defaults for two of them, and the other one must be provided by the user.
  let usage = "cmdarg [-w <time>] [-r <repeat>] <msg>
Prints <msg> out to standard output.
"
The rest of the usage message will be printed by the Arg module's parser function itself.
  let parse () =
    let wait = ref 0 in
    let repeat = ref 1 in
    let msg = ref None in
We set up the mutable variables inside the Cmd.parse function, where they are not visible to callers. This is the key to making the whole thing pure.
    let specs = [
      "-w", Arg.Set_int wait, "Time in seconds to wait before printing the message [default 0]";
      "-r", Arg.Set_int repeat, "How many times to print the message [default 1]";
    ]
A list of 3-tuples that describes the command-line options. The design is quite clever: as Arg parses the command line and comes across an option, it sets the corresponding mutable ref. If an option is not given, its ref stays at its default value.
    in
    let anon str = msg := Some str in
What to do when we come across an 'anonymous' argument, i.e. one not preceded by an option: in this case, wrap it in Some and assign that to the msg ref. If several anonymous arguments are given, each one overwrites the ref, so the last one wins.
    Arg.parse specs anon usage;
Here is where the mutation happens. It takes the specs, the anonymous argument handler, and the usage message and fills up the previously-defined refs as needed.
    {
      wait = !wait;
      repeat = !repeat;
      msg = match !msg with
        | Some m ->
          m
        | None ->
          Arg.usage specs usage;
          invalid_arg "<msg> is required";
    }
Here we create and return the actual record value, by grabbing the values set in the refs after parsing is finished. For msg, since it's wrapped in an option, we need to extract it and error out if the user didn't provide it. If the anonymous argument had actually been optional, we could have provided a default, or used a list to handle more than one anonymous argument.
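As an aside, the "default or list" variant mentioned above could look roughly like the following sketch. The helper name parse_msgs and the default message "hello" are assumptions made for illustration, not part of the article's code. It uses Arg.parse_argv so that the argument array is passed in explicitly.

```ocaml
(* Sketch: collect every anonymous argument instead of requiring exactly one.
   [parse_msgs] and the "hello" default are hypothetical. *)
let parse_msgs (argv : string array) : string list =
  let msgs = ref [] in
  (* Each anonymous argument is pushed onto the front of the list. *)
  let anon str = msgs := str :: !msgs in
  (* ~current:(ref 0) keeps this call independent of Arg's global cursor. *)
  Arg.parse_argv ~current:(ref 0) argv [] anon "cmdarg <msg>...";
  (* Restore command-line order; fall back to a default when none were given. *)
  match List.rev !msgs with
  | [] -> [ "hello" ]
  | given -> given
```

As before, the ref is local to the function and only the immutable list escapes.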
Note the critical thing here is we never expose the internal refs to the function caller, only the pure immutable record value created after parsing the command line and filling up the refs. When a function has internal mutation but the callers can't observe it, it is a pure function.
end
let () =
  let { Cmd.wait; repeat; msg } = Cmd.parse () in
  for _ = 1 to repeat do
    Unix.sleep wait;
    print_endline msg
  done
Finally, we call Cmd.parse and get the parsed command-line options in a nice, convenient record value that we destructure and use. Compile and test with:
ocamlopt -o cmdarg unix.cmxa cmdarg.ml
./cmdarg
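If you want to check the parser without running the binary, one possible variation is to take the argument array as a parameter and return a result instead of exiting. The sketch below is not the article's Cmd.parse: the name parse_args is hypothetical and the doc strings are shortened. It relies on Arg.parse_argv, which raises Arg.Bad or Arg.Help rather than terminating the program.

```ocaml
type t = { wait : int; repeat : int; msg : string }

let usage = "cmdarg [-w <time>] [-r <repeat>] <msg>"

(* [parse_args] is a hypothetical variant of Cmd.parse that takes argv
   explicitly and reports failures as values instead of exiting. *)
let parse_args (argv : string array) : (t, string) result =
  let wait = ref 0 in
  let repeat = ref 1 in
  let msg = ref None in
  let specs = [
    "-w", Arg.Set_int wait, "Time in seconds to wait [default 0]";
    "-r", Arg.Set_int repeat, "How many times to print [default 1]";
  ] in
  let anon str = msg := Some str in
  match Arg.parse_argv ~current:(ref 0) argv specs anon usage with
  | () ->
    (match !msg with
     | Some m -> Ok { wait = !wait; repeat = !repeat; msg = m }
     | None -> Error "<msg> is required")
  | exception Arg.Bad text -> Error text   (* malformed or unknown option *)
  | exception Arg.Help text -> Error text  (* -help or --help was given *)
```

Because the refs are created afresh on every call, calling parse_args twice with the same array gives the same result, which makes it straightforward to exercise in unit tests.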
Conclusion
This was a pretty simple example, but you can actually do some pretty sophisticated parsing, with some care. E.g., imagine getting hostnames and port numbers on the command line and parsing them internally into Unix.sockaddr addresses, so that it's super convenient to open sockets.
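A rough sketch of that idea follows. The -l option, the endpoint record, and the function names are all hypothetical. To stay within the standard library, it stops at a plain host/port record. Turning that into a Unix.sockaddr would be one more step, for example Unix.ADDR_INET (Unix.inet_addr_of_string host, port), assuming the host is a numeric address.

```ocaml
(* [endpoint], [parse_endpoint] and [parse_listen] are hypothetical names. *)
type endpoint = { host : string; port : int }

(* Split "host:port" at the last colon and validate the port range.
   Raising Arg.Bad lets Arg report the problem like any other bad option. *)
let parse_endpoint (s : string) : endpoint =
  match String.rindex_opt s ':' with
  | None -> raise (Arg.Bad ("expected <host>:<port>, got " ^ s))
  | Some i ->
    let host = String.sub s 0 i in
    let port_str = String.sub s (i + 1) (String.length s - i - 1) in
    (match int_of_string_opt port_str with
     | Some port when host <> "" && port > 0 && port < 65536 -> { host; port }
     | _ -> raise (Arg.Bad ("invalid endpoint " ^ s)))

let parse_listen (argv : string array) : (endpoint, string) result =
  let listen = ref None in
  let specs = [
    "-l", Arg.String (fun s -> listen := Some (parse_endpoint s)),
    "<host>:<port> Address to listen on";
  ] in
  let anon s = raise (Arg.Bad ("unexpected argument " ^ s)) in
  match Arg.parse_argv ~current:(ref 0) argv specs anon "server -l <host>:<port>" with
  | () -> Option.to_result ~none:"-l is required" !listen
  | exception Arg.Bad text -> Error text
  | exception Arg.Help text -> Error text
```

The conversion happens inside the option callback, so the caller only ever sees an already-validated value.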
It's probably possible to handle even more complex scenarios, e.g. subcommands like git, but in my opinion at that point it's probably worth using Cmdliner.
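To give a feel for what the subcommand case might involve, here is one possible sketch: dispatch on the first argument, then hand the remaining arguments to Arg.parse_argv with a per-subcommand spec list. The subcommands greet and count, their options, and the function parse_command are invented for illustration.

```ocaml
(* Hypothetical subcommands for illustration: "greet" and "count". *)
type command =
  | Greet of { name : string }
  | Count of { upto : int }

let parse_command (argv : string array) : (command, string) result =
  if Array.length argv < 2 then Error "expected a subcommand: greet | count"
  else
    (* Drop the program name so the subcommand becomes element 0 of the rest;
       Arg.parse_argv skips element 0 and parses what follows. *)
    let rest = Array.sub argv 1 (Array.length argv - 1) in
    let anon s = raise (Arg.Bad ("unexpected argument " ^ s)) in
    let run specs usage build =
      match Arg.parse_argv ~current:(ref 0) rest specs anon usage with
      | () -> Ok (build ())
      | exception Arg.Bad text -> Error text
      | exception Arg.Help text -> Error text
    in
    match argv.(1) with
    | "greet" ->
      let name = ref "world" in
      run [ "-n", Arg.Set_string name, "Name to greet [default world]" ]
        "greet [-n <name>]"
        (fun () -> Greet { name = !name })
    | "count" ->
      let upto = ref 10 in
      run [ "-u", Arg.Set_int upto, "Upper bound [default 10]" ]
        "count [-u <n>]"
        (fun () -> Count { upto = !upto })
    | other -> Error ("unknown subcommand " ^ other)
```

Even in this small form, the help text and error handling have to be repeated for each subcommand, which is the kind of bookkeeping that makes Cmdliner attractive as the CLI grows.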
Yawar Amin | Sciencx (2021-09-12T03:19:03+00:00) Quick-and-dirty pure command-line arguments in OCaml. Retrieved from https://www.scien.cx/2021/09/12/quick-and-dirty-pure-command-line-arguments-in-ocaml/