I like Perl and I like C (most of the time), and sometimes I like to mix the two. The two main reasons I might want to mix the two is for performance, or because something is written in C which I would like to use from Perl. I’ve only really ever used C from Perl, I’ve never used Perl from C. Today’s demonstration is how to implement a simple binary search algorithm in C, but using Perl internals, and calling the algorithm from Perl. The technologies used in today’s demonstration are Perl, C, and a Perl module called
So, here’s the flow of the code:
- At compile-time the
BinSearch.pmmodule compiles and links in (via
DynaLoader) the C code and any needed libraries
use_binsearch.plscript creates an array of items
use_binsearch.plscript calls the
do_binsearch()function looking for an item in an array
do_binsearch()function sorts the array (using Perl’s built-in mergesort in Perl 5.7+)
do_binsearch()function uses the C function
_c_bin_search()to find the item within the list
_c_bin_search()function performs the binary search using the function
compare()defined in the
__C__block to perform comparisons, keeping in mind that a value in Perl can be a string or a numeric type
&PL_sv_undefif nothing is found, otherwise it returns the value it found
do_binsearch()returns the value returned by the
[gist id=”6443144″ file=”use_binsearch.pl”] Here you can see that I’m exercising my code a little bit. I have an array with mixed types, an array with only one element, an array with no elements, and an array with only numeric elements.
[gist id=”6443144″ file=”BinSearch.pm”] [gist id=”6443144″ file=”BinSearch.pm.c”] Note: In the actual file, the
BinSearch.pm.c code is appended to the bottom of the
BinSearch.pm code. I’m treating it as two separate files in the Gist for syntax highlighting purposes.
The Use of
So the first part of this that you’ll notice is now it’s really weird that the C code is in the same file as the Perl code. This is actually really cool, and there are ways you can remove the code from the Perl into its own file, but I’m not going to cover that here. The most common use – for me at least – of
Inline::C involves having the code in the __DATA__ block, which is a special data segment in Perl code. I am not going to cover what the
__DATA__ segment is, but if you would like more information about this special literal – and others – please See Here.
Inline::C module has all sorts of really cool and useful options. I have used
Inline::C for simple things (like this demonstration), but also for building data feeders for large C applications in Perl.
Inline::C is as simple or as complex as you’re willing to use it.
The largest weakness in
Inline::C is portability.
Inline::C requires a working build environment with compiler and linker, and you have to give it all of the build options you’ll need. As anybody who has dealt with build processes for larger applications will tell you, this is a very platform-specific task. If you’re making a module you need portability with then you should probably pay special attention to that, or maybe find a Pure-Perl way of performing your task.
Now let’s talk about the code. Here you’re going to see a lot of weird symbols flying, here are the fun ones you’ll see:
SV– Scalar Value
AV– Array Value (really an array reference in Perl)
I32– A 32-bit integer
SvNOK()– A macro to determine whether or not a scalar value is a numeric value (includes number types with and without precision)
SvOK()– A macro to determine whether or not a value is actually a valid scalar value to Perl (think
sv_cmp()– An API function which does a string compare in Perl, taking into account that it’s possible that your two values may include scalars holding numeric strings. It is important to note that this will not do a numeric comparison, only a string comparison.
av_len()– An API function which returns the last index of an array (e.g.
av_fetch()– An API function which returns a pointer to a pointer of a scalar value. If the last argument to this function is non-zero then it should always return a valid and usable scalar value. I don’t do that since I want to be able to detect undefined values.
PL_sv_undef– This is
undef. Keep in mind that my function returns an
SV*, so I want to return the address of
PL_sv_undefwhen it is time to
SvREFCNT_inc()– This is an API function which increments the reference counter of a scalar value. Since I am returning the scalar value from a C function, I want to increment this count so it doesn’t release it immediately after my function exits scope. The garbage collector will reap this scalar value later on when the thing that my function returns to releases (or never holds) a reference to the scalar.
die()– These are API implementations of the Perl functions with the same names that you have come to know and love (or not)
Please remember that Perl is a reference-counting garbage-collected language. Messing with the reference count carelessly will certainly result in memory leaks or GC reaping which results in crashes. Additionally, great care must be taken when using this sort of code in threads. Thread safety is a tricky thing even in the best of cases, and this is rather far from the happy path. Over-optimization is very common for people who dabble in
It is important to note that the cross-over from Perl to C comes at a high performance cost, especially when done in looping situations. If you need to send large volumes of data to a C function you really should put it into chunks and then process it in chunks.
It should be remembered that both the first and second Rule of Optimization Club is that you do not talk about optimization. The other two relevant rules are that you only ever address one factor of an optimization problem at a time, and you should keep testing a long as you have to keep testing.
You should never use
Inline::C to solve a performance problem which you have not measured. If you cannot measure a performance problem then you do not have a performance problem (or you do not have the ability to address it).
Inline::C to be very useful, and I think you will too. This is a very simple demonstration of a very large and feature-rich aspect of Perl programming. This is also quite possibly the largest Gist of the Day I have done since I began this effort. Please forgive me when most of the other articles are much smaller than this one.
Here is the link to the full Gist: https://gist.github.com/manchicken/6443144