After a couple days of on & off frustrated debugging, I've just discovered a bug in
ioutil.Discard.
The original version of ioutil.Discard was safe:
https://code.google.com/p/go/source/detail?r=cb593b108aa5 , it only had Write method.
A later commit added a ReadFrom method:
https://code.google.com/p/go/source/detail?r=13faa632ba3a#
That ReadFrom CL uses an intentionally-racy global variable, blackHole, which is re-used
for all Read calls.
io.Copy uses ReadFrom, if available:
func Copy(dst Writer, src Reader) (written int64, err error) {
// If the writer has a ReadFrom method, use it to do the copy.
// Avoids an allocation and a copy.
if rt, ok := dst.(ReaderFrom); ok {
return rt.ReadFrom(src)
}
......
The race detector had a problem with this,
https://golang.org/issue/3970 , and so worked around it by using a
safe version: https://code.google.com/p/go/source/detail?r=1e55cf10aa4f#
And I just hit this problem in a real program, getting corrupt SHA-1 results from files.
In my program, I had a io.Reader wrapper which calculated the SHA-1 of contents as they
were read.
In some cases, my code did:
io.Copy(ioutil.Discard, sha1TrackingReader)
... which resulted in calls to ioutil.Discard.ReadFrom(sha1TrackingReader) and meant
that my sha1TrackingReader got the same racy buffer provided to it from multiple
concurrent goroutines.
Hence my SHA-1 corruption.
My workaround fix
(perkeep/perkeep@22621b80f67afe117189cbf15467e84eb79ec8d8)
was:
_, err = io.Copy(struct{ io.Writer }{ioutil.Discard}, td)
Considering how much time it took me to debug this, how surprising it was, and how it
broke composition of standard interface types, I think this should be fixed. If we do a
Go1.0.4, I'd like it to be fixed there, too.
The memory allocation reduction (of reusing the global variable) can be obtained with a
simple byte pool channel instead.