Skip to content

refactor retry logic on mutate row#4665

Merged
theacodes merged 14 commits intogoogleapis:masterfrom
zakons:feature/refactor_retry_logic_on_mutate_row
Feb 5, 2018
Merged

refactor retry logic on mutate row#4665
theacodes merged 14 commits intogoogleapis:masterfrom
zakons:feature/refactor_retry_logic_on_mutate_row

Conversation

@aneepct
Copy link
Contributor

@aneepct aneepct commented Dec 27, 2017

Added retry logic for commit on a DirectRow. The retry uses the retry.py module and performs retry when a Rendezvous exception is thrown due to a disconnect on the channel. The retry will call MutateRow on the original protobuf request with a timeout backoff algorithm provided in retry.py. This has been tested using an endurance test framework that runs for 23 hours and was receiving 8+ exceptions previously. Now all the retries are successful.

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Dec 27, 2017
@sduskis
Copy link
Contributor

sduskis commented Dec 27, 2017

Can you please add a description for the pull request?

Copy link
Contributor

@theacodes theacodes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR also needs tests.

bigtable_pb2 as messages_v2_pb2)
from google.api_core import retry
from google.cloud import exceptions
from functools import partial, update_wrapper

This comment was marked as spam.

This comment was marked as spam.

"""The maximum number of mutations that a row can accumulate."""


def wrapped_partial(func, *args, **kwargs):

This comment was marked as spam.

This comment was marked as spam.

self._table,
request_pb)
retry_ = retry.Retry(
predicate=retry.if_exception_type(exceptions.GrpcRendezvous),

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

mutations_list.extend(to_append)


def call_mutate_row(table, request_pb):

This comment was marked as spam.

This comment was marked as spam.

@tseaver tseaver added the api: bigtable Issues related to the Bigtable API. label Jan 2, 2018
@chemelnucfin chemelnucfin added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Jan 15, 2018
@theacodes
Copy link
Contributor

@aneepct this PR has open comments and little activity, do you still intend to merge this?

@aneepct
Copy link
Contributor Author

aneepct commented Jan 17, 2018

I am actively resolving all the review questions now and will advance to the state needed for merging. Please also see question from @zakons above.

@zakons
Copy link
Contributor

zakons commented Jan 18, 2018

All questions / requests have been responded to. If it is not possible in the current release to get the mapping to exceptions.ServiceUnavailable from the _Rendezvous exception (see comment above), this could be done later when that mapping is available. Can we move forward with the merge?

Copy link
Contributor

@theacodes theacodes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still needs tests.

self._table,
request_pb)
retry_ = retry.Retry(
predicate=retry.if_exception_type(exceptions.GrpcRendezvous),

This comment was marked as spam.

return partial_func


def call_mutate_row(table, request_pb):

This comment was marked as spam.

mutations_list.extend(to_append)


def wrapped_partial(func, *args, **kwargs):

This comment was marked as spam.

BadGateway = exceptions.BadGateway
ServiceUnavailable = exceptions.ServiceUnavailable
GatewayTimeout = exceptions.GatewayTimeout
RetryError = exceptions.RetryError

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@zakons
Copy link
Contributor

zakons commented Jan 19, 2018

Thanks. Almost there. Just not sure how to do the checking for ServiceUnavailable (the exception that we want). Here is the retry logic with the predicate you are recommending:

retry_commit = functools.partial(
    _call_mutate_row, 
     table=self._table, 
     request_pb=request_pb)

retry_ = retry.Retry(
    predicate=retry.if_exception_type(exceptions.from_grpc_error),
    deadline=30)

retry_(retry_commit)()

Where does the checking for exceptions.ServiceUnavailable go?

@theacodes
Copy link
Contributor

@zakons you'll need to basically write a custom predicate function:

def _should_retry_commit_exception(exc):
    if isinstance(exc, grpc.RpcError):
        exc = exceptions.from_grpc_error(exc)
    return isinstance(exc, exceptions.ServiceUnavailable)

...

retry_commit = functools.partial(
    _call_mutate_row,
    table=self._table,
    request_pb=request_pb)

retry_ = retry.Retry(
    predicate=_should_retry_commit_exception,
    deadline=30)

retry_(retry_commit)()

@aneepct
Copy link
Contributor Author

aneepct commented Jan 26, 2018

@tseaver @jonparrott @sduskis The circleci coverage is now at 100% and our endurance test looks good, can we please have this PR merged? Many thanks for your help.

@tseaver
Copy link
Contributor

tseaver commented Jan 26, 2018

LGTM if @jonparrott says go.


import struct

import functools

This comment was marked as spam.


import functools
import six
import grpc

This comment was marked as spam.

data_pb2 as data_v2_pb2)
from google.cloud.bigtable._generated import (
bigtable_pb2 as messages_v2_pb2)
from google.api_core import retry

This comment was marked as spam.

return isinstance(exc, exceptions.ServiceUnavailable)


def _call_mutate_row(table, request_pb):

This comment was marked as spam.

client = self._table._instance._client
client._data_stub.MutateRow(request_pb)

retry_commit = functools.partial(

This comment was marked as spam.


expected_result = True
result = _check_rendezvous_exception(
grpc._channel._Rendezvous(State, None, None, 0))

This comment was marked as spam.

BadGateway = exceptions.BadGateway
ServiceUnavailable = exceptions.ServiceUnavailable
GatewayTimeout = exceptions.GatewayTimeout
RetryError = exceptions.RetryError

This comment was marked as spam.

@aneepct
Copy link
Contributor Author

aneepct commented Jan 30, 2018

@jonparrott How do i use RpcError and set the code to UNAVAILABLE?

@aneepct
Copy link
Contributor Author

aneepct commented Jan 31, 2018

@jonparrott @zakons OK, we are at 100% coverage and all of your review comments have been incorporated. Please review and if OK, please merge.

@zakons
Copy link
Contributor

zakons commented Feb 1, 2018

@jonparrott @tseaver Can we please merge this PR now that all requested changes have been applied? Our endurance test results are all fine as well, meaning that the UNAVAILABLE exception is caught and the commit retried until success.

@theacodes
Copy link
Contributor

@zakons @aneepct just a heads up - I'm traveling and will not have a chance to review this until later next week. Apologies for the delay, and thank you for your patience.

@theacodes theacodes merged commit a5bb904 into googleapis:master Feb 5, 2018
@theacodes
Copy link
Contributor

Thanks everyone! Sorry for the protracted review process here, I'm glad we ended up with something short and sweet. :)

@zakons
Copy link
Contributor

zakons commented Feb 5, 2018

@jonparrott Thank-you! We were impressed with your reviews. In fact, I have been looking at your python testing style guide and am finding it very clear and helpful.

@zakons zakons deleted the feature/refactor_retry_logic_on_mutate_row branch February 5, 2018 21:56
@theacodes
Copy link
Contributor

Awesome, I'm glad to be helpful. :)

chemelnucfin pushed a commit to chemelnucfin/google-cloud-python that referenced this pull request Feb 8, 2018
parthea pushed a commit that referenced this pull request Nov 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: bigtable Issues related to the Bigtable API. cla: yes This human has signed the Contributor License Agreement. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants

Comments