Python Reproducible Builds?

biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Python Reproducible Builds?

Post by biolizard89 »

I understand that Joseph is working on reproducible builds for Armory (which uses Python). @Joseph, could you provide some information on how applicable this work is to NMControl? What kinds of changes would be needed for NMControl to build reproducibly (for Windows, OS X, and Linux) using your work?
Jeremy Rand, Lead Namecoin Application Engineer
NameID: id/jeremy
DyName: Dynamic DNS update client for .bit domains.

Donations: BTC 1EcUWRa9H6ZuWPkF3BDj6k4k1vCgv41ab8 ; NMC NFqbaS7ReiQ9MBmsowwcDSmp4iDznjmEh5

phelix
Posts: 1634
Joined: Thu Aug 18, 2011 6:59 am

Re: Python Reproducible Builds?

Post by phelix »

biolizard89 wrote:I understand that Joseph is working on reproducible builds for Armory (which uses Python). @Joseph, could you provide some information on how applicable this work is to NMControl? What kinds of changes would be needed for NMControl to build reproducibly (for Windows, OS X, and Linux) using your work?
+1, that sounds interesting.

By the way, I saw there was a talk about reproducible builds at CCCamp 2015. It should be available online, but I have not watched it yet myself.
nx.bit - some namecoin stats
nf.bit - shortcut to this forum

josephbisch
Posts: 69
Joined: Sun Nov 23, 2014 3:34 pm
os: linux

Re: Python Reproducible Builds?

Post by josephbisch »

I saw the cccamp 2015 talk by Lunar about reproducible builds. It is interesting to watch, especially if you aren't already intimately familiar with the reproducible build pages on the Debian wiki. While I was already familiar with it, I learned that debbindiff (the binary diff tool used by the Debian reproducible build project) had been renamed to diffoscope. There is also a talk from Lunar that just happened at DebConf and a reproducible builds BoF (a BoF is a gathering of people to discuss a common interest) available here, though it can be a little hard to hear what people are saying on those BoF recordings.

As for the actual question, I'll start with Linux. Originally I was just using Gitian to create a compressed archive of the Armory source tree after having built Armory (which really just involves creating a shared object (.so) file from the cpp code, which then gets loaded by the Python code via SWIG). Since Armory traditionally was distributed as a deb package, we switched to using Debian's reproducible build toolchain. Since it is based around reproducing deb packages it seemed like a better choice for the specific case of reproducing deb packages than Gitian was. A particular issue was that Gitian doesn't currently allow root permissions in the VM, which can be a problem with deb packaging in many cases. So in combination with a modified version of the existing make_deb_package.py script, I was able to make a reproducible deb package.

I think it just happened to be reproducible with the use of the Debian toolchain, but there were some challenges. The Debian toolchain added -Wdate-time to CPPFLAGS, which is a GCC >=4.9 feature, so it caused problems when building using an Ubuntu Trusty chroot (GCC 4.8.2). So I added "export DEB_CPPFLAGS_MAINT_STRIP=-Wdate-time" to the rules*.template files to prevent the toolchain from adding that flag.

I also needed a way to download the dependencies for the target OS for the offline bundle (Ubuntu Precise) no matter what distro-version combo a builder is using for the host. So I created an apt config file that points to a sources.list file for Precise in the Armory source tree and used that config file when resolving and downloading Armory dependencies in the make_deb_package.py script.

I also figured out a method to determine the packages that are installed by default on whatever OS we choose (in our case Precise) and exclude those default packages from the recursive resolution and download of the dependency packages for the offline bundle. You can see the method in make_deb_package.py or I can explain it if you want.
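To illustrate the idea of excluding default packages from the recursive resolution, here is a minimal sketch. It is not the code from make_deb_package.py: the function name `resolve_bundle`, the package names, and the hand-written dependency table are all made up for illustration, and it assumes that the dependencies of a default-installed package are themselves already installed, so the walk stops there.

```python
def resolve_bundle(roots, depends, default_installed):
    """Return the sorted list of packages to download for an offline bundle.

    roots: packages the application depends on directly.
    depends: dict mapping a package name to its direct dependencies.
    default_installed: packages present on a fresh install of the target OS.
    """
    default_installed = set(default_installed)
    needed = set()
    # Start from the direct dependencies that are not already on the target.
    stack = [pkg for pkg in roots if pkg not in default_installed]
    while stack:
        pkg = stack.pop()
        if pkg in needed:
            continue
        needed.add(pkg)
        for dep in depends.get(pkg, ()):
            # Do not descend into packages the target already ships with.
            if dep not in default_installed and dep not in needed:
                stack.append(dep)
    return sorted(needed)


# Hypothetical data, only to show the shape of the inputs.
example_depends = {
    "libgui": ["libcore", "libc"],
    "libnet": ["libcore", "libssl"],
    "libssl": ["libc"],
}
example_defaults = {"libc", "libssl"}
bundle = resolve_bundle(["libgui", "libnet"], example_depends, example_defaults)
print(bundle)  # ['libcore', 'libgui', 'libnet']
```

In a real script the dependency table and the default package list would come from apt and from the target release, not from literals like these.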

So it should hopefully just work for making a reproducible deb package of NMControl too (especially if it is just pure Python like I think it is). Armory also has the spec files and instructions for an rpm package, but that is unofficial/unsupported and provided as a convenience for users. It is not reproducible (or at least isn't intended to be). I don't know what the status is on the Fedora reproducible build project, but it isn't something that really affects Armory since they don't have an official rpm package.
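For a pure-Python project, the things that most often make an archive differ between two builds are file order, timestamps, and ownership metadata. As a rough sketch of pinning those with the standard library (this is not what Gitian or the Debian toolchain does internally, and `deterministic_tar_gz` is a made-up helper name):

```python
import gzip
import io
import os
import tarfile


def deterministic_tar_gz(src_dir, fixed_mtime=0):
    """Return .tar.gz bytes for the regular files under src_dir.

    The output does not depend on file timestamps, owners, or the order
    in which the filesystem lists entries. Empty directories are skipped.
    """
    def normalize(info):
        # Replace per-machine metadata with fixed values.
        info.mtime = fixed_mtime
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        # Keep only the distinction between executable and non-executable.
        info.mode = 0o755 if info.mode & 0o100 else 0o644
        return info

    paths = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            paths.append(os.path.join(root, name))
    paths.sort()  # fixed order regardless of directory listing order

    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for path in paths:
            arcname = os.path.relpath(path, src_dir).replace(os.sep, "/")
            tar.add(path, arcname=arcname, recursive=False, filter=normalize)

    out = io.BytesIO()
    # mtime=0 and an empty filename keep the gzip header constant.
    with gzip.GzipFile(filename="", fileobj=out, mode="wb", mtime=0) as gz:
        gz.write(raw.getvalue())
    return out.getvalue()
```

Calling it twice on the same tree, even after touching the files, should give byte-identical output as long as the file contents and the Python/zlib versions are the same.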

For Windows and OS X, we are going with Gitian. That is trickier, because we traditionally provided an app file for OS X with Python and Qt embedded in it, so that users wouldn't need to install Python and/or Qt themselves. I haven't been able to get cross-compiling of Python 2 from Linux for OS X to work, so it would be impossible to do the full build with Linux VMs. Python 3 is much easier to cross-compile for OS X (still requires patches though), so I was able to make an app with Python 3 and Qt 5. The problem is that Armory itself is still on Python 2 and Qt 4, so when running the app, you aren't actually taking advantage of the embedded Python 3 and Qt 5 and you still need to manually install Python 2 and Qt 4. But the app does run. So the OS X app is just waiting on the Armory transition to Python 3/Qt 5.

The Windows version is traditionally distributed as an exe using py2exe, which is a Windows-only program. So we were looking at other Python freezing methods and settled on pyqtdeploy. I worked on just trying to get it to work outside of Gitian first, before integrating it into a Gitian descriptor. I ended up running into issues with building Python from Linux. First of all, Python 2 was a no-go. Python 3 actually compiled eventually (with many patches). I ran into issues though, because of the ssl module. I got a rather cryptic message about some "slot error" or something like that. I was pleasantly surprised that the resulting exe ran at all on Windows once the ssl stuff was removed (and everything seemed to work fine), but without ssl available we couldn't actually use the exe as it was. I moved on to other things after that.

So, unfortunately it doesn't look possible to make an exe using Linux that embeds Python in the exe and uses the ssl module, unless someone figures out what the error is about.

So it looks like we could just adopt the Linux and OS X stuff if we are going with Python 3 and hopefully it will just work. Though we have to try it to see if there are any sources of variance that are particular to NMControl. Windows is trickier and I'm not sure if there is a good way to get an exe of Python code from Linux if we don't want the user to have to install Python manually.
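One simple way to look for sources of variance is to build twice and compare the two output trees file by file. The sketch below is only an illustration (the function names are made up, and it is not part of the Armory tooling); it reports which files differ, after which a tool like diffoscope can show how they differ.

```python
import hashlib
import os


def tree_digests(root):
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            h = hashlib.sha256()
            with open(full, "rb") as f:
                # Read in chunks so large build artifacts fit in memory.
                for chunk in iter(lambda: f.read(65536), b""):
                    h.update(chunk)
            digests[os.path.relpath(full, root)] = h.hexdigest()
    return digests


def differing_files(build_a, build_b):
    """Return sorted relative paths that are missing on one side or differ."""
    a, b = tree_digests(build_a), tree_digests(build_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

An empty result from `differing_files` for two independent builds means the outputs are byte-identical; anything it lists is a candidate source of variance.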

biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Re: Python Reproducible Builds?

Post by biolizard89 »

Okay, cool -- thanks for the detailed explanation. I think we're going to switch to Python3 sometime soon (unless something comes up unexpectedly), so that should be fine. I actually don't think NMControl uses the Python ssl module (I think we're using PyOpenSSL for such things instead). If I'm correct on that, does that make it feasible to create a Windows .exe using the stuff you attempted?

josephbisch
Posts: 69
Joined: Sun Nov 23, 2014 3:34 pm
os: linux

Re: Python Reproducible Builds?

Post by josephbisch »

biolizard89 wrote:Okay, cool -- thanks for the detailed explanation. I think we're going to switch to Python3 sometime soon (unless something comes up unexpectedly), so that should be fine. I actually don't think NMControl uses the Python ssl module (I think we're using PyOpenSSL for such things instead). If I'm correct on that, does that make it feasible to create a Windows .exe using the stuff you attempted?
If you are using PyOpenSSL, it looks like you can get away without the ssl module. It does make it feasible to make a Windows .exe, but I haven't tried to make it reproducible yet, because I stopped when I encountered the ssl issue.
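A quick static scan can help confirm whether anything in a source tree imports ssl directly. This is only a sketch with a made-up function name: it sees plain `import ssl` and `from ssl import ...` statements in the project's own files, not dynamic imports or imports made inside third-party or standard library modules, and files that the running interpreter cannot parse (for example Python 2 source under Python 3) are reported separately instead of being scanned.

```python
import ast
import os


def find_direct_imports(source_root, target="ssl"):
    """Return (files importing target, files that could not be parsed)."""
    hits, unparsed = [], []
    for dirpath, _dirs, files in os.walk(source_root):
        for name in files:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                try:
                    tree = ast.parse(f.read(), filename=path)
                except SyntaxError:
                    unparsed.append(path)
                    continue
            for node in ast.walk(tree):
                if isinstance(node, ast.Import):
                    names = [alias.name for alias in node.names]
                elif isinstance(node, ast.ImportFrom) and node.level == 0:
                    # level == 0 means an absolute import, not "from .ssl".
                    names = [node.module or ""]
                else:
                    continue
                if any(n == target or n.startswith(target + ".") for n in names):
                    hits.append(path)
                    break
    return sorted(hits), sorted(unparsed)
```

Note that `from OpenSSL import SSL` (PyOpenSSL) does not count as a hit, since the imported module there is `OpenSSL`, not `ssl`.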

Check out this PR. The instructions could use some streamlining, and it isn't using Gitian yet, as I said.

biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Re: Python Reproducible Builds?

Post by biolizard89 »

By the way, "slot error" sounds Qt-specific, since I don't know of any Python language feature that is called "slots". I don't fully grok the details of how you're freezing the Python libs, but is it possible that's a Qt issue rather than a Python or OpenSSL issue? (I assume you already looked into this, I'm just curious.)

josephbisch
Posts: 69
Joined: Sun Nov 23, 2014 3:34 pm
os: linux

Re: Python Reproducible Builds?

Post by josephbisch »

biolizard89 wrote:By the way, "slot error" sounds Qt-specific, since I don't know of any Python language feature that is called "slots". I don't fully grok the details of how you're freezing the Python libs, but is it possible that's a Qt issue rather than a Python or OpenSSL issue? (I assume you already looked into this, I'm just curious.)
I don't think it is Qt-related. Python does have slots. But I guess anything is possible and it could be a Qt issue that is somehow affecting the import of any Python modules that in turn import the ssl module.
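For reference, the language-level feature is `__slots__`, which fixes the set of instance attributes a class allows. CPython also uses "slots" internally for the function pointers on C-level types, and it is not clear from the message which of the two the error refers to, so the example below only shows what `__slots__` is, not the cause of the ssl problem.

```python
class Point:
    # Instances get exactly these attributes and no per-instance __dict__.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
try:
    p.z = 3  # not listed in __slots__, so this assignment is rejected
except AttributeError as exc:
    error = exc
    print("AttributeError:", exc)
```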

I found this issue, which seems to be the same as mine. The bug reporter claimed that a "redownload of the original package" fixed the issue, but I'm not sure which package they meant. I tried multiple versions of OpenSSL and Python, which should have had the same effect as redownloading the source of both Python and OpenSSL.

Then there is this SO post, which is also the same issue I was experiencing. The resolution ended up being that the wrong version of OpenSSL was being chosen and that had to be prevented. So I removed all instances of OpenSSL dev files except for the ones I generated from the OpenSSL source code, which was cross-compiled with MinGW. I still experienced the same issue. I'm not sure what would happen if I tried using an OpenSSL binary or a version natively compiled on Windows instead of compiling it myself. Maybe there is an issue with using MinGW to cross-compile, though that would be surprising given that everything else compiled fine with MinGW. If there is indeed an issue with compiling OpenSSL with MinGW from Linux, then I'm not even sure how to go about debugging that.

biolizard89
Posts: 2001
Joined: Tue Jun 05, 2012 6:25 am
os: linux

Re: Python Reproducible Builds?

Post by biolizard89 »

Hmm, interesting, didn't know that Python had slots. Thanks for educating me. :)
