Porting to Python 3 - The Book Site

Migration strategies

Making a new release of software that is backwards incompatible is a high-risk strategy. When people need to rewrite their software, or maintain separate versions of their source code to support both versions of a language or a framework, the risk is that they never make the transition and stay on the old version forever or, worse, that they switch to another framework.

For that reason Python versions 2.6 and later include support for migrating to Python 3, in the form of 2to3, a program and package that can convert code from Python 2 to Python 3. There are other techniques and strategies you can use and there are also different ways to use 2to3. Which strategy to use depends very much on what type of software you are converting.

Only supporting Python 3

The easiest case is when you only need to support one version of Python at a time. In these cases you can just convert your code to Python 3 and forget about Python 2. With this strategy you will first use the 2to3 tool to make an automatic conversion of most of the changes and then fix every problem that remains manually in the Python 3 code. You will probably also want to go through all of the converted code and clean it up, as the 2to3 conversions may not always be the most elegant solution for your case.

Separate branches for Python 2 and Python 3

If you need to continue to support Python 2, the simplest case is having a branch in your source tree with the code for Python 2 and another branch for Python 3. You will then have to make every change on the two different branches, which is a bit more work, but feasible if the code doesn’t change that often.

One problem with this strategy is that your distribution becomes complex, because you now have two distributions and you need to make sure that Python 3 users get the Python 3 version and Python 2 users get the Python 2 version of your package. Solutions for this are documented in Distributing packages.

Converting to Python 3 with 2to3

The officially recommended way of supporting both Python 2 and Python 3 is to maintain the source code in a Python 2 version and convert it with 2to3 for Python 3 support. That means you will have to run 2to3 each time you have made a code change so you can test it under Python 3.

To do this you need a script that performs the 2to3 conversion, because doing all the steps manually quickly becomes tedious. You need to run the conversion after every change before you can run the tests under Python 3, and the conversion can be rather slow, so you want to convert only the files that have been modified. The conversion script should therefore compare time stamps on the files to see which ones have changed and convert only those, which means the script cannot be a trivial shell script.
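The time stamp comparison can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name and the layout with one source directory and one parallel directory of converted files are hypothetical, and the actual 2to3 run on each pair is left out.

```python
import os


def files_needing_conversion(source_dir, target_dir):
    """Yield (source, target) pairs for .py files that are new or modified."""
    for dirpath, dirnames, filenames in os.walk(source_dir):
        for filename in filenames:
            if not filename.endswith(".py"):
                continue
            source = os.path.join(dirpath, filename)
            # Mirror the source layout inside the target directory.
            relative = os.path.relpath(source, source_dir)
            target = os.path.join(target_dir, relative)
            # Convert when the target is missing or older than the source.
            if (not os.path.exists(target)
                    or os.path.getmtime(source) > os.path.getmtime(target)):
                yield source, target
```

A conversion script would loop over these pairs, copy each source file to its target location and run 2to3 on the copy, so that unchanged files are skipped on the next run.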

You can of course write these conversion scripts yourself, but you might not need to. If you are using Distutils it has support for running 2to3 as a part of the build process. This also solves the distribution problems, as this way you can distribute only Python 2 code and 2to3 will be run on that code during install when installed on Python 3. That way you don’t have to have separate packages or even two copies of the code in your package. Distributing packages also has information on this.

However, the lazy coder’s approach here would be to use Distribute.

Using Distribute to support the 2to3 conversion

Distribute [2] is a fork of Phillip J. Eby’s popular Setuptools package and provides Python 3 compatibility, as well as extensions that simplify supporting Python 2 and Python 3 from the same source. Basically, Distribute extends the principles of the Distutils build_py_2to3 command and integrates 2to3 into all parts of the packaging story.

With Distribute you can add a few extra parameters in the setup.py file to have 2to3 run the conversion at build time. This means you only need to have one version of the source in your version control system and you therefore only need to fix bugs once. You also need only one source release, so you only have to release the software once and there is only one package to download and install for both Python 2 and Python 3.

You still need to run your tests under all versions of Python that you want to support, but Distribute includes a test command that will convert your code with 2to3 before running the tests. You can easily set up your package to use that. Then testing becomes just running python setup.py test once for every Python version you want to support.
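As a rough sketch of what those extra parameters can look like: the Distribute documentation [3] describes a use_2to3 keyword for setup(), and test_suite tells the test command where the tests live. The package name, version and test module below are hypothetical placeholders, and the setup() call itself is left as a comment so the snippet can be read on its own.

```python
import sys


def build_setup_arguments(version_info=sys.version_info):
    """Return keyword arguments for setup(), enabling 2to3 on Python 3."""
    arguments = {
        "name": "example_package",               # hypothetical package name
        "version": "1.0",
        "packages": ["example_package"],
        "test_suite": "example_package.tests",   # hypothetical test module
    }
    if version_info >= (3,):
        # Only ask for the conversion when building under Python 3.
        arguments["use_2to3"] = True
    return arguments


# In a real setup.py you would finish with:
#     from setuptools import setup
#     setup(**build_setup_arguments())
```

Adding the keyword only under Python 3 keeps the same setup.py usable with plain Setuptools on Python 2, where the parameter may not be recognized.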

The main drawback with this solution is that you can’t use the earliest versions of 2to3, because they are too buggy. In practice it means you need to have Python 3.1 or later installed on the target machine. This is generally not a problem, as most platforms that support Python 3 already use Python 3.1 for that support.

You can find examples of how to set up your module or package to use Distribute for your Python 3 support under Supporting multiple versions of Python with Distribute as well as in the standard Distribute documentation [3].

Python 2 and Python 3 without conversion

In many cases it’s perfectly feasible to modify the code so that it runs under both Python 2 and Python 3 without needing any conversion, although you have to apply several tricks to avoid the incompatibilities between Python 2 and Python 3.

There are also cases where you can’t use Distribute, or don’t want to. You may need to distribute your code in a format that is not installable with Distutils, and therefore not with Distribute either. In those cases you can’t use Distribute’s 2to3 support, so using 2to3 becomes more work and not using it may be the more attractive prospect.

Python 2.6 and 2.7 have a lot of forward compatibility, which makes supporting Python 2.6 and Python 3 without conversion much easier than supporting Python 2.5 and Python 3. Supporting 2.5 or even older versions therefore means you have to employ even more tricks, especially when it comes to printing and exception handling.
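Exception handling is a good illustration of the difference. The sketch below uses hypothetical function names: the first variant uses the "except ... as" syntax that Python 2.6 and later share with Python 3, while the second fetches the exception through sys.exc_info(), a workaround that also runs on Python 2.5 and earlier.

```python
import sys


def parse_port(text):
    """Return text as an int, or None if it is not a number."""
    try:
        return int(text)
    except ValueError as error:  # valid on Python 2.6+ and Python 3
        sys.stderr.write("bad port %r: %s\n" % (text, error))
        return None


def parse_port_py25(text):
    """Same behavior, written so it also runs on Python 2.5 and earlier."""
    try:
        return int(text)
    except ValueError:
        # Fetch the exception without the "as" syntax.
        error = sys.exc_info()[1]
        sys.stderr.write("bad port %r: %s\n" % (text, error))
        return None
```

Both functions write to sys.stderr directly instead of printing, which sidesteps the difference between the print statement and the print function.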

Although Benjamin Peterson’s excellent six module [4] helps by wrapping many of these incompatibilities, I would generally recommend using Distribute and 2to3 over trying to run the code on both Python 2 and Python 3 without 2to3. Running without conversion is perfectly possible and has been done even for some rather large projects, but it is generally not worth the trouble. I’d recommend that you make a serious effort with 2to3 first.

However, even if you do use 2to3 for your project as a whole, you may still end up having to write some code so that it runs on both Python 2 and Python 3 without conversion. This is useful for bootstrapping scripts and setup scripts, or if your code generates code from strings, for example to create command line scripts. You can of course have two separate strings depending on the Python version, or even run 2to3 on the string using lib2to3. However, in these cases it’s generally easier to make the generated code snippets run on all Python versions without 2to3.
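A minimal sketch of such a generated snippet follows. The template and function names are hypothetical; the point is only that the generated script sticks to syntax that is valid on every Python version, so it needs neither two templates nor a 2to3 pass.

```python
# The generated script uses only syntax common to Python 2 and Python 3:
# no print statement, no "except X, e", no unicode literals.
SCRIPT_TEMPLATE = """\
import sys
from %(module)s import %(function)s

if __name__ == '__main__':
    sys.exit(%(function)s())
"""


def generate_script(module, function):
    """Return the source of a command line script calling module.function."""
    return SCRIPT_TEMPLATE % {"module": module, "function": function}
```

For example, generate_script("example_package.cli", "main") returns a script that imports main from the hypothetical example_package.cli module and exits with its return value.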

More information on the techniques necessary to do this is in the chapter Supporting Python 2 and 3 without 2to3 conversion.

Using 3to2

The 2to3 tool is flexible enough for you to define what changes should be done by writing “fixers”. Almost any kind of Python code conversion is imaginable here and 3to2 [1] is a set of fixers written by Joe Amenta that does the conversion from Python 3 to Python 2. This enables you to write your code for Python 3 and then convert it to Python 2 before release.

However, there is no Distribute support for 3to2, and Python 2.5 and earlier do not include the required lib2to3 package. Therefore 3to2 currently remains only an interesting experiment, although this may change in the future.

Which strategy is for you?

Applications

Unless your code is a reusable package or framework you probably do not need to support older versions of Python. The exception is when some of your customers are stuck on Python 2 while others demand that you support Python 3. In most cases you can just drop Python 2 support completely.

Python modules and packages

If you are developing some sort of module or package that other Python developers use you would probably like to support both Python 2 and Python 3 at the same time. The majority of your users will run Python 2 for some time to come, so you want to give them access to new functionality, but if you don’t support Python 3, the users of Python 3 must find another package to fulfill their need.

If the package is stable from a functional standpoint, it might be perfectly reasonable to have separate branches in your version control system and make bugfixes on both branches separately. In this case you run 2to3 on your main branch once and fix it up until it looks nice and passes your tests again. You would have to distribute both Python 2 and Python 3 code, which can be done by including both code trees in the package and letting setup.py install the right version.
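One way to let setup.py install the right version is to pick the source tree from the running Python's version. The sketch below assumes a hypothetical layout with the Python 2 code under src2 and the Python 3 code under src3. The function name is also made up for illustration.

```python
import sys


def select_source_tree(version_info=sys.version_info):
    """Return the directory that holds the code for the running Python."""
    if version_info[0] >= 3:
        return "src3"   # hypothetical directory with the Python 3 code
    return "src2"       # hypothetical directory with the Python 2 code


# In setup.py this would feed the package_dir option, for example:
#     from distutils.core import setup
#     setup(name="example_package", packages=["example_package"],
#           package_dir={"": select_source_tree()})
```

Because the choice is made when setup.py runs, a single source distribution can carry both trees and still install only the one that matches the interpreter.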

If your package is under active development you probably want to support both Python 2 and Python 3 at the same time from the same code base. Whether you want to use the official 2to3 conversion method or try to get the code running under both Python 2 and Python 3 without a conversion step depends mostly on what your code does. There are cases where you won’t be able to run the same code under Python 2 and Python 3 without a lot of effort because it relies so much on Python internals that the code becomes too different. There are also cases where the code is so straightforward that running 2to3 on it hardly makes any sense.

Most code is somewhere in between and the decision is not always easy. A good idea is to run 2to3 on your code and look at the differences. If 2to3 makes a lot of changes in your code, then you probably want to use it to convert the code, to minimize the number of workarounds.

If you are already releasing your package using Distutils or its descendants Setuptools and Distribute, then using Distribute’s 2to3 support is easy. Even if you are not using any of these it may be less work for you to start using them than to write the support scripts for 2to3 conversion yourself. With Distribute all of the work needed to use 2to3 has been done for you already and you need only to worry about the actual porting.

Frameworks

The benefit of using frameworks when developing doesn’t only come from the framework itself, but also from the plugins and extensions available to it. It is therefore important to make it easy for all developers using and extending the framework to switch to Python 3, as you otherwise risk being stuck on Python 2 forever.

If your framework is extended by writing Python packages that use Distutils, Setuptools or Distribute as a packaging system, the users of your framework are already in a good position, as they can use Distutils or Distribute for the 2to3 conversion to support both new and old versions of the framework. If extensions are packaged and distributed in some other way than with Distutils you may want to consider making your own set of support scripts to make the transition easier.

Summary

In general, if you write end-user software, you can just switch to Python 3, starting with a one-time run of 2to3 on your code. If you write a Python package you want to support both Python 2 and Python 3 at the same time, and the easiest way to do that is to use Distribute’s 2to3 support.

Footnotes

[1] http://pypi.python.org/pypi/3to2
[2] http://pypi.python.org/pypi/distribute
[3] http://packages.python.org/distribute/python3.html
[4] http://pypi.python.org/pypi/six