On Wed, Apr 15, 2015 at 12:04 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> I think we definitely need to do that ASAP. And possibly then force
> an immediate minor release. Bugs that eat your data are bad, and we
> have a customer hitting this completely independently of this report,
> which makes this look like more than a theoretical problem.
>
>
I defer to others on what a good timeline would be, but failing harder here
would be good. We were somewhat lucky in discovering the issue as fast as
we did. After the wraparound, 99.9+% of the queries to our database were
still returning without error. We only observed the failure when accessing
two specific rows in two different tables (there were possibly more; it's
a multi-TiB db cluster). Had it taken us a few days rather than a few
hours to hit those rows, recovery would have been extremely difficult, as
rolling back to a known good state wouldn't really have been an option.
Tim