
Embedding Python in C++ Applications with boost::python

Posted on June 9, 2011 by joseph.turner

In the Introduction to this tutorial series, I took a look at the motivation for integrating Python code into the Granola code base. In short, it allows me to leverage all the benefits of the Python language and the Python standard library when approaching tasks that are normally painful or awkward in C++. The underlying subtext, of course, is that I didn’t have to port any of the existing C++ code to do so.

Today, I’d like to take a look at some first steps in using boost::python to embed Python in C++ and interact with Python objects. I’ve put all the code from this section in a GitHub repo, so feel free to check the code out and play along.

At its core, embedding Python is very simple and requires no C++ code whatsoever: the libraries provided with a Python distribution include C bindings. I’m going to skip over all that, though, and jump straight into using Python in C++ via boost::python, which provides class wrappers and polymorphic behavior much more consistent with actual Python code than the C bindings would allow. In the later parts of this tutorial, we’ll cover a few things that you can’t do with boost::python (notably, multithreading and error handling).

So anyway, to get started you need to download and build boost, or retrieve a copy from your package manager. If you choose to build it, you can build just the boost::python library (it is unfortunately not header-only), though I would suggest getting familiar with the entire set of libraries if you do a lot of C++ programming. If you are following along with the git repo, make sure you change the path in the Makefile to point to your boost installation directory. And thus concludes the exposition. Let’s dive in!

First, we need to be able to build an application with Python embedded. With gcc this isn’t too difficult; it is simply a matter of including boost::python and libpython as either static or shared libraries. Depending on how you build boost, you may have
trouble mixing and matching. In the tutorial code on github, we will use the static boost::python library (libboost_python.a) and the dynamic version of the Python library (libpython.so).

One of the soft requirements I had for my development efforts at MiserWare was to make the environment consistent across all of our supported operating systems: several versions of Windows and an ever-changing list of Linux distros. As a result, Granola links against a pinned version of Python and the installation packages include the Python library files required to run our code. Not ideal, perhaps, but it results in an environment where I am positive our code will run across all supported operating systems.

Let’s get some code running. You’ll need to include the correct headers, as you might imagine.

#include <boost/python.hpp>  // also pulls in the Python C API (Python.h)
namespace py = boost::python;
...
Py_Initialize();
py::object main_module = py::import("__main__");
py::object main_namespace = main_module.attr("__dict__");

Note that you must initialize the Python interpreter directly, with the Py_Initialize() call. While boost::python greatly eases the task of embedding Python, it does not handle everything you need to do. As I mentioned above, we’ll see some more shortcomings in future sections of the tutorial. After initializing, the __main__ module is imported and its namespace is extracted. This results in a blank canvas upon which we can then call Python code, adding modules and variables.

boost::python::exec("print 'Hello, world'", main_namespace);
boost::python::exec("print 'Hello, world'[3:5]", main_namespace);
boost::python::exec("print '.'.join(['1','2','3'])", main_namespace);


The exec function runs the arbitrary code in the string parameter within the specified namespace. Everything that is available without an import (built-in functions and types) can be used. Of course, this isn’t very useful without being able to import modules and extract values.

boost::python::exec("import random", main_namespace);
boost::python::object rand = boost::python::eval("random.random()", main_namespace);
std::cout << boost::python::extract<double>(rand) << std::endl;


Here we’ve imported the random module by executing the corresponding Python statement within the __main__ namespace, bringing the module into the namespace. After the module is available, we can use functions, objects, and variables within the namespace. In this example, we use the eval function, which returns the result of the passed-in Python expression, to create a boost::python object containing a random value as returned by the random() function in the random module. Finally, we extract the value as a C++ double type and print it.

This may seem a bit... soft. Calling Python by passing formatted Python strings into C++ functions? Not a very object-oriented way of dealing with things. Fortunately, there is a better way.

boost::python::object rand_mod = boost::python::import("random");
boost::python::object rand_func = rand_mod.attr("random");
boost::python::object rand2 = rand_func();
std::cout << boost::python::extract<double>(rand2) << std::endl;


In this final example, we import the random module, but this time using the boost::python import function, which loads the module into a boost::python object. Next, the random function object is extracted from the random module and stored in a boost::python object. The function is called, returning a Python object containing the random number. Finally, the double value is extracted and printed. In general, all Python objects can be handled in this way: functions, classes, built-in types.

It really starts getting interesting when you start holding complex standard library objects and instances of user-defined classes. In the next tutorial, I’ll take a full class through its paces, build a bona fide configuration parsing class around the ConfigParser module, and discuss the details of parsing Python exceptions from C++ code.

In Part 1, we took a look at embedding Python in C++ applications, including several ways of calling Python code from your application. Though I earlier promised a full implementation of a configuration parser in Part 2, I think it’s more constructive to take a look at error parsing. Once we have a good way to handle errors in Python code, I’ll create the promised configuration parser in Part 3. Let’s jump in!

If you got yourself a copy of the git repo for the tutorial and were playing around with it, you may have experienced the way boost::python handles Python errors: the error_already_set exception type. If not, the following code will generate the exception:

namespace py = boost::python;
...
Py_Initialize();
...
py::object rand_mod = py::import("fake_module");


…which outputs the not-so-helpful:

terminate called after throwing an instance of 'boost::python::error_already_set'
Aborted

In short, any errors that occur in the Python code that boost::python handles will cause the library to throw this exception; unfortunately, the exception does not encapsulate any of the information about the error itself. To extract information about the error, we’re going to have to resort to using the Python C API and some Python itself. First, catch the error:

try{
    Py_Initialize();
    py::object rand_mod = py::import("fake_module");
}catch(boost::python::error_already_set const &){
    std::string perror_str = parse_python_exception();
    std::cout << "Error in Python: " << perror_str << std::endl;
}


Above, we've called the parse_python_exception function to extract the error string and print it. As this suggests, the exception data is held in the Python interpreter's error indicator rather than encapsulated in the exception itself. The first step in the parse_python_exception function, then, is to extract that data using the PyErr_Fetch Python C API function:

std::string parse_python_exception(){
    PyObject *type_ptr = NULL, *value_ptr = NULL, *traceback_ptr = NULL;
    PyErr_Fetch(&type_ptr, &value_ptr, &traceback_ptr);
    std::string ret("Unfetchable Python error");
    ...


As there may be all, some, or none of the exception data available, we set up the returned string with a fallback value. Next, we try to extract and stringify the type data from the exception information:

    ...
    if(type_ptr != NULL){
        py::handle<> h_type(type_ptr);
        py::str type_pstr(h_type);
        py::extract<std::string> e_type_pstr(type_pstr);
        if(e_type_pstr.check())
            ret = e_type_pstr();
        else
            ret = "Unknown exception type";
    }
    ...


In this block, we first check that there is actually a valid pointer to the type data. If there is, we construct a boost::python::handle to the data from which we then create a str object. This conversion should ensure that a valid string extraction is possible, but to double check we create an extract object, check the object, and then perform the extraction if it is valid. Otherwise, we use a fallback string for the type information.

Next, we perform a very similar set of steps on the exception value:

    ...
    if(value_ptr != NULL){
        py::handle<> h_val(value_ptr);
        py::str a(h_val);
        py::extract<std::string> returned(a);
        if(returned.check())
            ret += ": " + returned();
        else
            ret += std::string(": Unparseable Python error: ");
    }
    ...


We append the value string to the existing error string. The value string is, for most built-in exception types, the readable string describing the error.

Finally, we extract the traceback data:

    if(traceback_ptr != NULL){
        py::handle<> h_tb(traceback_ptr);
        py::object tb(py::import("traceback"));
        py::object fmt_tb(tb.attr("format_tb"));
        py::object tb_list(fmt_tb(h_tb));
        py::object tb_str(py::str("\n").join(tb_list));
        py::extract<std::string> returned(tb_str);
        if(returned.check())
            ret += ": " + returned();
        else
            ret += std::string(": Unparseable Python traceback");
    }
    return ret;
}


The traceback goes similarly to the type and value extractions, except for the extra step of formatting the traceback object as a string. For that, we import the traceback module. From traceback, we then extract the format_tb function and call it with the handle to the traceback object. This generates a list of traceback strings which we then join into a single string. Not the prettiest printing, perhaps, but it gets the job done. Finally, we extract the C++ string type as above, append it to the returned error string, and return the entire result.
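Stripped of the Python calls, the assembly logic above amounts to appending up to three optional pieces to a fallback string. The sketch below reproduces only that logic with plain strings. The name assemble_error and its nullable pointer parameters are hypothetical stand-ins for the three objects fetched by PyErr_Fetch; they are not part of the tutorial code.

```cpp
#include <string>

// Hypothetical stand-in for parse_python_exception(): each pointer plays the
// role of one object fetched by PyErr_Fetch, already converted to a string.
// A null pointer means that piece of exception data was not available.
std::string assemble_error(const std::string *type,
                           const std::string *value,
                           const std::string *traceback) {
    // Fallback used when nothing at all could be fetched.
    std::string ret("Unfetchable Python error");
    if (type != nullptr)
        ret = *type;               // the type replaces the fallback
    if (value != nullptr)
        ret += ": " + *value;      // the value is appended after the type
    if (traceback != nullptr)
        ret += ": " + *traceback;  // the formatted traceback comes last
    return ret;
}
```

When only a type and a value are available, the result has the form "type: value"; when nothing is available, the fallback string is returned unchanged.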

In the context of the earlier error, the application now generates the following output:

Error in Python: <type 'exceptions.ImportError'>: No module named fake_module

Generally speaking, this function will make it much easier to get to the root cause of problems in your embedded Python code. One caveat: if you are configuring a custom Python environment (especially module paths) for your embedded interpreter, the parse_python_exception function may itself throw a boost::python::error_already_set when it attempts to load the traceback module, so you may want to wrap the call to the function in a try...catch block and parse only the type and value pointers out of the result.

As I mentioned above, in Part 3 I will walk through the implementation of a configuration parser built on top of the ConfigParser Python module. Assuming, of course, that I don't get waylaid again.

In Part 2 of this tutorial, I covered a methodology for handling exceptions thrown from embedded Python code from within the C++ part of your application. This is crucial for debugging your embedded Python code. In this tutorial, we will create a simple C++ class that leverages Python functionality to handle an often-irritating part of developing real applications: configuration parsing.

In an attempt to not draw ire from the C++ elites, I am going to say this in a diplomatic way: I suck at complex string manipulations in C++. STL strings and stringstreams greatly simplify the task, but performing application-level tasks, and performing them in a robust way, always results in me writing more code than I would really like. As a result, I recently rewrote the configuration parsing mechanism from Granola Connect (the daemon in Granola Enterprise that handles communication with the Granola REST API) using embedded Python and specifically the ConfigParser module.

Of course, string manipulations and configuration parsing are just an example. For Part 3, I could have chosen any number of tasks that are difficult in C++ and easy in Python (web connectivity, for instance), but the configuration parsing class is a simple yet complete example of embedding Python for something of actual use. Grab the code from the GitHub repo for this tutorial to play along.

First, let’s create a class definition that covers very basic configuration parsing: read and parse INI-style files, extract string values given a name and a section, and set string values for a given section. Here is the class declaration:

class ConfigParser{
private:
    boost::python::object conf_parser_;

    void init();
public:
    ConfigParser();

    bool parse_file(const std::string &filename);
    std::string get(const std::string &attr,
                    const std::string &section = "DEFAULT");
    void set(const std::string &attr,
             const std::string &value,
             const std::string &section = "DEFAULT");
};


The ConfigParser module offers far more features than we will cover in this tutorial, but the subset we implement here should serve as a template for implementing more complex functionality. The implementation of the class is fairly simple; first, the constructor loads the main module, extracts the dictionary, imports the ConfigParser module into the namespace, and creates a boost::python::object member variable holding a RawConfigParser object:

ConfigParser::ConfigParser(){
    py::object mm = py::import("__main__");
    py::object mn = mm.attr("__dict__");
    py::exec("import ConfigParser", mn);
    conf_parser_ = py::eval("ConfigParser.RawConfigParser()", mn);
}


The file parsing and the getting and setting of values are performed using this conf_parser_ object:

bool ConfigParser::parse_file(const std::string &filename){
    // read() returns the list of files it parsed successfully
    return py::len(conf_parser_.attr("read")(filename)) == 1;
}

std::string ConfigParser::get(const std::string &attr, const std::string &section){
    return py::extract<std::string>(conf_parser_.attr("get")(section, attr));
}

void ConfigParser::set(const std::string &attr, const std::string &value, const std::string &section){
    conf_parser_.attr("set")(section, attr, value);
}

In this simple example, for the sake of brevity, exceptions are allowed to propagate. In a more complex environment, you will almost certainly want to have the C++ class handle and repackage the Python exceptions as C++ exceptions. This way you could later create a pure C++ class if performance or some other concern became an issue.

To use the class, calling code can simply treat it as a normal C++ class:

int main(){
    Py_Initialize();
    try{
        ConfigParser parser;
        parser.parse_file("conf_file.1.conf");
        cout << "Directory (file 1): " << parser.get("Directory", "DEFAULT") << endl;
        parser.parse_file("conf_file.2.conf");
        cout << "Directory (file 2): " << parser.get("Directory", "DEFAULT") << endl;
        cout << "Username: " << parser.get("Username", "Auth") << endl;
        cout << "Password: " << parser.get("Password", "Auth") << endl;
        parser.set("Directory", "values can be arbitrary strings", "DEFAULT");
        cout << "Directory (force set by application): " << parser.get("Directory") << endl;
        // Will raise a NoOption exception
        // cout << "Proxy host: " << parser.get("ProxyHost", "Network") << endl;
    }catch(boost::python::error_already_set const &){
        string perror_str = parse_python_exception();
        cout << "Error during configuration parsing: " << perror_str << endl;
    }
}


And that's that: a key-value configuration parser with sections and comments in under 50 lines of code. This is just the tip of the iceberg too. In almost the same length of code, you can do all sorts of things that would be at best painful and at worst error prone and time consuming in C++: configuration parsing, list and set operations, web connectivity, file format operations (think XML/JSON), and myriad other tasks are already implemented in the Python standard library.

In Part 4, I'll take a look at how to more robustly and generically call Python code using functors and a Python namespace class.

In Part 2 of this ongoing tutorial, I introduced code for parsing Python exceptions from C++. In Part 3, I implemented a simple configuration parsing class utilizing the Python ConfigParser module. As part of that implementation, I mentioned that for a project of any scale, one would want to catch and deal with Python exceptions within the class, so that clients of the class wouldn’t have to know about the details of Python. From the perspective of a caller, then, the class would be just like any other C++ class.

The obvious way of handling the Python exceptions would be to handle them in each function. For example, the get function of the C++ ConfigParser class we created would become:

std::string ConfigParser::get(const std::string &attr, const std::string &section){
    try{
        return py::extract<std::string>(conf_parser_.attr("get")(section, attr));
    }catch(boost::python::error_already_set const &){
        std::string perror_str = parse_python_exception();
        throw std::runtime_error("Error getting configuration option: " + perror_str);
    }
}


The error handling code remains the same, but now the main function becomes:

int main(){
    Py_Initialize();
    try{
        ConfigParser parser;
        parser.parse_file("conf_file.1.conf");
        ...
        // Will raise a NoOption exception
        cout << "Proxy host: " << parser.get("ProxyHost", "Network") << endl;
    }catch(const exception &e){
        cout << "Here is the error, from a C++ exception: " << e.what() << endl;
    }
}


When the Python exception is raised, it will be parsed and repackaged as a std::runtime_error, which is caught at the caller and handled like a normal C++ exception (i.e. without having to go through the parse_python_exception rigmarole). For a project that only has a handful of functions or a class or two utilizing embedded Python, this will certainly work. For a larger project, though, one wants to avoid the large amount of duplicated code and the errors it will inevitably bring.

For my implementation, I wanted to always handle the errors in the same way, but I needed a way to call different functions with different signatures. I decided to leverage another powerful area of the boost library: the function-object (functor) libraries, and specifically boost::bind and boost::function. boost::function provides functor class wrappers, and boost::bind (among other things) binds arguments to functions. The two together, then, enable the passing of functions and their arguments that can be called at a later time. Just what the doctor ordered!
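As a rough illustration of such deferred calls, the sketch below uses std::function and std::bind, the C++11 standard-library counterparts of boost::function and boost::bind, swapped in here so that the example needs no Boost headers. The names repeat and make_deferred are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical function with two parameters.
std::string repeat(const std::string &text, std::size_t times) {
    std::string out;
    for (std::size_t i = 0; i < times; ++i)
        out += text;
    return out;
}

// Bind both arguments now (std::bind stores copies of them). The returned
// functor takes no arguments and can be stored, passed around, and invoked
// at a later time.
std::function<std::string()> make_deferred(const std::string &text,
                                           std::size_t times) {
    return std::bind(&repeat, text, times);
}
```

Nothing runs when make_deferred is called; repeat executes only when the returned functor is invoked, which is what lets a single error-handling routine wrap functions with different signatures.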

To utilize the functor, the function needs to know about the return type. Since we're wrapping functions with different signatures, a function template does the trick nicely:

template <class return_type>
return_type call_python_func(boost::function<return_type ()> to_call, const std::string &error_pre){
    std::string error_str(error_pre);

    try{
        return to_call();
    }catch(boost::python::error_already_set const &){
        error_str = error_str + parse_python_exception();
        throw std::runtime_error(error_str);
    }
}


This function takes a functor object wrapping a function that calls boost::python functions. Each function that calls boost::python code will now be split into two functions: the private core function that uses the Python functionality and a public wrapper function that uses the call_python_func function. Here is the updated get function and its partner:

string ConfigParser::get(const string &attr, const string &section){
    return call_python_func<string>(boost::bind(&ConfigParser::get_py, this, attr, section),
                                    "Error getting configuration option: ");
}

string ConfigParser::get_py(const string &attr, const string &section){
    return py::extract<string>(conf_parser_.attr("get")(section, attr));
}


The get function binds the passed-in arguments, along with the implicit this pointer, to the get_py function, which in turn calls the boost::python functions necessary to perform the action. Simple and effective.
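To see the wrapper and core split in isolation, here is a minimal sketch that compiles without Boost or Python. It assumes a stand-in exception type (fake_python_error) in place of boost::python::error_already_set, a fixed message in place of parse_python_exception(), and std::function and std::bind in place of their boost equivalents. The Settings class and call_guarded are hypothetical names, not part of the tutorial code.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for boost::python::error_already_set (an assumption of this sketch).
struct fake_python_error {};

// Same shape as call_python_func, with std::function instead of boost::function.
template <class return_type>
return_type call_guarded(std::function<return_type()> to_call,
                         const std::string &error_pre) {
    try {
        return to_call();
    } catch (const fake_python_error &) {
        // The real code would append parse_python_exception() here.
        throw std::runtime_error(error_pre + "lookup failed");
    }
}

// Hypothetical class following the public-wrapper / private-core split.
class Settings {
    std::map<std::string, std::string> values_;

    // Core function: the only place that throws the low-level error.
    std::string get_core(const std::string &attr) {
        std::map<std::string, std::string>::const_iterator it = values_.find(attr);
        if (it == values_.end())
            throw fake_python_error();
        return it->second;
    }

public:
    void set(const std::string &attr, const std::string &value) {
        values_[attr] = value;
    }

    // Public wrapper: binds this and the argument, then delegates to the guard.
    std::string get(const std::string &attr) {
        return call_guarded<std::string>(
            std::bind(&Settings::get_core, this, attr),
            "Error getting configuration option: ");
    }
};
```

Callers of Settings::get only ever see std::runtime_error, so nothing outside the class needs to know which low-level error type the core function throws.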

Of course, there is a tradeoff here. Instead of the repeated code of the try...catch blocks and Python error handling, there are double the number of functions declared per class. For my purposes, I prefer the second form, as it more effectively utilizes the compiler to find errors, but mileage may vary. The most important point is to handle Python errors at a level of code that understands Python. If your entire application needs to understand Python, you should consider rewriting in Python rather than embedding, perhaps with some C++ modules as needed.
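With a C++11 compiler, one possible way to soften this tradeoff is to pass a lambda to a single guard template, which keeps the core logic inline and avoids declaring a second member function. This is a substitute technique, not what the tutorial code does. In the sketch below, guard_call, low_level_error, and parse_port are hypothetical names; low_level_error stands in for boost::python::error_already_set, and a fixed message stands in for parse_python_exception().

```cpp
#include <stdexcept>
#include <string>

// Stand-in for boost::python::error_already_set (an assumption of this sketch).
struct low_level_error {};

// Accepts any zero-argument callable and returns whatever it returns.
template <class Callable>
auto guard_call(Callable to_call, const std::string &error_pre)
    -> decltype(to_call()) {
    try {
        return to_call();
    } catch (const low_level_error &) {
        // The real code would append parse_python_exception() here.
        throw std::runtime_error(error_pre + "low-level failure");
    }
}

// Hypothetical caller: the core logic lives in the lambda, so no separate
// private function is needed.
int parse_port(const std::string &text) {
    return guard_call([&text]() -> int {
        if (text.empty())
            throw low_level_error();
        return std::stoi(text);
    }, "Error parsing port: ");
}
```

The compiler still checks the lambda body against the return type, so most of the compile-time safety of the two-function form is kept. Only low_level_error is translated here; any other exception thrown inside the lambda propagates unchanged.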

As always, you can follow along with the tutorial by cloning the GitHub repo.