somename wrote: If you don't trust Python.org's checksum-ed binaries, why trust their source (and all the Python modules from authors who usually trust Python.org or Linux distro binaries)?
There is a chance that people can read the source code, understand what is going on, and identify possible issues. How many people in the world do you think can read the contents of a binary and understand what is going on? I would venture to guess the number is zero for something as large as CPython.
Reproducible builds, when the hashes of the resulting binaries match across all the builders, tell users that either the binaries actually came from the source code or all the builders are lying to us in a consistent way. Setting the latter case aside for now, reproducible builds let us step back and examine the binaries by examining the source code. And since many more people can read source code, there are more eyes available to identify possible issues.
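To make the hash comparison concrete, here is a minimal sketch of the check a user could run over binaries collected from several independent builders. The function names and the builder names are made up for illustration, and the byte strings stand in for real build artifacts; this is not how any particular project does it.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a build artifact as a hex string."""
    return hashlib.sha256(data).hexdigest()


def builders_agree(artifacts: dict[str, bytes]) -> tuple[bool, dict[str, str]]:
    """Hash each builder's artifact and report whether all digests match.

    `artifacts` maps a (hypothetical) builder name to the bytes it produced.
    Returns (True, digests) only if there is at least one artifact and
    every builder produced the same digest.
    """
    digests = {name: sha256_hex(blob) for name, blob in artifacts.items()}
    return len(set(digests.values())) == 1, digests


if __name__ == "__main__":
    # Stand-in data: in practice these would be the bytes of the binary
    # each builder produced from the same source tree.
    artifacts = {
        "builder-a": b"identical build output",
        "builder-b": b"identical build output",
        "builder-c": b"identical build output",
    }
    ok, digests = builders_agree(artifacts)
    for name, digest in sorted(digests.items()):
        print(name, digest)
    print("reproducible" if ok else "MISMATCH: at least one build differs")
```

A match only says the builders agree with each other. As noted above, it cannot rule out every builder being compromised in the same way, and a single mismatch does not say which builder is the odd one out.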
The whole point is to avoid trusting individual people and groups as much as possible. Just because many people currently trust Python binaries from the official website or from Linux distros doesn't mean we shouldn't try to progress beyond trusting binaries where possible. Even if you trust the Python Software Foundation not to purposefully add a backdoor to the binaries they publish, they can't guarantee that the build process was not somehow compromised unless they use reproducible builds.
I do recognize that it may not currently be possible to make something as large as Python reproducible across all platforms without some effort, so I'm not attacking the Python Software Foundation for not having reproducible builds right now. But where possible, I think reproducibility should be the goal.
You mention Linux distro binaries. You may know that Debian is moving along nicely with its reproducible builds effort, and other distros are looking to follow in Debian's footsteps. Debian, at least, is taking steps to make its entire package archive reproducible, so eventually its Python packages will be reproducible too. The tricky issue is building Python reproducibly for platforms other than Linux.
Yes, you do have to trust someone to some extent. But maybe one day, when all open source software (including OSes) is built reproducibly, users of open source software will only have to trust the hardware, instead of both hardware and software as we do now.
Or maybe reproducible builds will all just be a passing fad. Who knows.
I also recommend the talk from CCCongress 2014.