Updated DB_Helper by adding firebase methods.

Batuhan Berk Başoğlu 2020-10-05 16:53:40 -04:00
parent 485cc3bbba
commit c82121d036
1810 changed files with 537281 additions and 1 deletion

@@ -0,0 +1,167 @@
# Copyright 2016 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Google Cloud Bigtable HappyBase package.
This package is intended to emulate the HappyBase library using
Google Cloud Bigtable as the backing store.
Differences in Public API
-------------------------
Some concepts from HBase/Thrift do not map directly to the Cloud
Bigtable API. As a result, the following instance methods and functions
could not be implemented:
* :meth:`Connection.enable_table() \
<gcloud.bigtable.happybase.connection.Connection.enable_table>` - no
concept of enabled/disabled
* :meth:`Connection.disable_table() \
<gcloud.bigtable.happybase.connection.Connection.disable_table>` - no
concept of enabled/disabled
* :meth:`Connection.is_table_enabled() \
<gcloud.bigtable.happybase.connection.Connection.is_table_enabled>`
- no concept of enabled/disabled
* :meth:`Connection.compact_table() \
<gcloud.bigtable.happybase.connection.Connection.compact_table>` -
table storage is opaque to user
* :meth:`Table.regions() <gcloud.bigtable.happybase.table.Table.regions>`
- tables in Cloud Bigtable do not expose internal storage details
* :meth:`Table.counter_set() \
<gcloud.bigtable.happybase.table.Table.counter_set>` - method can't
be atomic, so we disable it
* The ``__version__`` value for the HappyBase package is :data:`None`.
However, it's worth noting this implementation was based on HappyBase
0.9.
In addition, many of the constants from
:mod:`connection <gcloud.bigtable.happybase.connection>`
are specific to HBase and are defined as :data:`None` in our module:
* ``COMPAT_MODES``
* ``THRIFT_TRANSPORTS``
* ``THRIFT_PROTOCOLS``
* ``DEFAULT_HOST``
* ``DEFAULT_PORT``
* ``DEFAULT_TRANSPORT``
* ``DEFAULT_COMPAT``
* ``DEFAULT_PROTOCOL``
Two of these, ``DEFAULT_HOST`` and ``DEFAULT_PORT``, are even imported in
the main :mod:`happybase <gcloud.bigtable.happybase>` package.
Finally, we do not provide the ``util`` module. Though it is public in the
HappyBase library, it provides no core functionality.
API Behavior Changes
--------------------
* Since there is no concept of an enabled / disabled table, calling
:meth:`Connection.delete_table() \
<gcloud.bigtable.happybase.connection.Connection.delete_table>`
with ``disable=True`` can't be supported.
Using that argument will result in a warning.
* The :class:`Connection <gcloud.bigtable.happybase.connection.Connection>`
constructor **disables** the use of several
arguments and will print a warning if any of them are passed in as keyword
arguments. The arguments are:
* ``host``
* ``port``
* ``compat``
* ``transport``
* ``protocol``
* In order to make
:class:`Connection <gcloud.bigtable.happybase.connection.Connection>`
compatible with Cloud Bigtable, we add an ``instance`` keyword argument to
allow users to pass in their own
:class:`Instance <gcloud.bigtable.instance.Instance>` (which they can
construct beforehand).
For example:
.. code:: python
from gcloud.bigtable.client import Client
client = Client(project=PROJECT_ID, admin=True)
instance = client.instance(instance_id, location_id)
instance.reload()
from gcloud.bigtable.happybase import Connection
connection = Connection(instance=instance)
* Any uses of the ``wal`` (Write Ahead Log) argument will result in a
warning as well. This includes uses in:
* :class:`Batch <gcloud.bigtable.happybase.batch.Batch>`
* :meth:`Batch.put() <gcloud.bigtable.happybase.batch.Batch.put>`
* :meth:`Batch.delete() <gcloud.bigtable.happybase.batch.Batch.delete>`
* :meth:`Table.put() <gcloud.bigtable.happybase.table.Table.put>`
* :meth:`Table.delete() <gcloud.bigtable.happybase.table.Table.delete>`
* :meth:`Table.batch() <gcloud.bigtable.happybase.table.Table.batch>` factory
* When calling
:meth:`Connection.create_table() \
<gcloud.bigtable.happybase.connection.Connection.create_table>`, the
majority of HBase column family options cannot be used. Among them:
* ``max_versions``
* ``compression``
* ``in_memory``
* ``bloom_filter_type``
* ``bloom_filter_vector_size``
* ``bloom_filter_nb_hashes``
* ``block_cache_enabled``
* ``time_to_live``
Only ``max_versions`` and ``time_to_live`` are available in Cloud Bigtable
(as
:class:`MaxVersionsGCRule <gcloud.bigtable.column_family.MaxVersionsGCRule>`
and
:class:`MaxAgeGCRule <gcloud.bigtable.column_family.MaxAgeGCRule>`).
In addition to using a dictionary for specifying column family options,
we also accept instances of :class:`.GarbageCollectionRule` or subclasses.
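  For example, a sketch (assuming an existing ``connection``):
  .. code:: python

      from gcloud.bigtable.column_family import MaxVersionsGCRule

      connection.create_table(
          'my-table',
          {
              'cf1': {'max_versions': 10},
              'cf2': {'time_to_live': 3600},
              'cf3': MaxVersionsGCRule(1),
          },
      )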
* :meth:`Table.scan() <gcloud.bigtable.happybase.table.Table.scan>` no longer
accepts the following arguments (which will result in a warning):
* ``batch_size``
* ``scan_batching``
* ``sorted_columns``
* Using a HBase filter string in
:meth:`Table.scan() <gcloud.bigtable.happybase.table.Table.scan>` is
not possible with Cloud Bigtable and will result in a
:class:`TypeError <exceptions.TypeError>`. However, the method now accepts
instances of :class:`.RowFilter` and subclasses.
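  For example, a sketch (assuming an existing ``table``):
  .. code:: python

      from gcloud.bigtable.row_filters import ColumnQualifierRegexFilter

      row_filter = ColumnQualifierRegexFilter(b'col1')
      for row_key, row_data in table.scan(filter=row_filter):
          pass  # Do something with the matching rows.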
* :meth:`Batch.delete() <gcloud.bigtable.happybase.batch.Batch.delete>` (and
hence
:meth:`Table.delete() <gcloud.bigtable.happybase.table.Table.delete>`)
will fail with a :class:`ValueError <exceptions.ValueError>` when either a
row or column family delete is attempted with a ``timestamp``. This is
because the Cloud Bigtable API uses the ``DeleteFromFamily`` and
``DeleteFromRow`` mutations for these deletes, and neither of these
mutations support a timestamp.
"""
from gcloud.bigtable.happybase.batch import Batch
from gcloud.bigtable.happybase.connection import Connection
from gcloud.bigtable.happybase.connection import DEFAULT_HOST
from gcloud.bigtable.happybase.connection import DEFAULT_PORT
from gcloud.bigtable.happybase.pool import ConnectionPool
from gcloud.bigtable.happybase.pool import NoConnectionsAvailable
from gcloud.bigtable.happybase.table import Table
# Values from HappyBase that we don't reproduce / are not relevant.
__version__ = None

@@ -0,0 +1,326 @@
# Copyright 2016 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Google Cloud Bigtable HappyBase batch module."""
import datetime
import warnings
import six
from gcloud._helpers import _datetime_from_microseconds
from gcloud.bigtable.row_filters import TimestampRange
_WAL_SENTINEL = object()
# Assumed granularity of timestamps in Cloud Bigtable.
_ONE_MILLISECOND = datetime.timedelta(microseconds=1000)
_WARN = warnings.warn
_WAL_WARNING = ('The wal argument (Write-Ahead-Log) is not '
'supported by Cloud Bigtable.')
class Batch(object):
"""Batch class for accumulating mutations.
.. note::
When using a batch with ``transaction=False`` as a context manager
(i.e. in a ``with`` statement), mutations will still be sent as
row mutations even if the context manager exits with an error.
This behavior is in place to match the behavior in the HappyBase
HBase / Thrift implementation.
:type table: :class:`Table <gcloud.bigtable.happybase.table.Table>`
:param table: The table where mutations will be applied.
:type timestamp: int
:param timestamp: (Optional) Timestamp (in milliseconds since the epoch)
that all mutations will be applied at.
:type batch_size: int
:param batch_size: (Optional) The maximum number of mutations to allow
to accumulate before committing them.
:type transaction: bool
:param transaction: Flag indicating if the mutations should be sent
transactionally or not. If ``transaction=True`` and
an error occurs while a :class:`Batch` is active,
then none of the accumulated mutations will be
committed. If ``batch_size`` is set, the mutation
can't be transactional.
:type wal: object
:param wal: Unused parameter (Boolean for using the HBase Write Ahead Log).
Provided for compatibility with HappyBase, but irrelevant for
Cloud Bigtable since it does not have a Write Ahead Log.
:raises: :class:`TypeError <exceptions.TypeError>` if ``batch_size``
is set and ``transaction=True``.
:class:`ValueError <exceptions.ValueError>` if ``batch_size``
is not positive.
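    For example, a minimal sketch (assuming an existing ``table``):
    .. code:: python

        with table.batch(batch_size=20) as batch:
            batch.put(b'row-key1', {b'cf:col1': b'value1'})
            batch.delete(b'row-key2', columns=[b'cf:col2'])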
"""
def __init__(self, table, timestamp=None, batch_size=None,
transaction=False, wal=_WAL_SENTINEL):
if wal is not _WAL_SENTINEL:
_WARN(_WAL_WARNING)
if batch_size is not None:
if transaction:
raise TypeError('When batch_size is set, a Batch cannot be '
'transactional')
if batch_size <= 0:
raise ValueError('batch_size must be positive')
self._table = table
self._batch_size = batch_size
self._timestamp = self._delete_range = None
# Timestamp is in milliseconds, convert to microseconds.
if timestamp is not None:
self._timestamp = _datetime_from_microseconds(1000 * timestamp)
# For deletes, we get the very next timestamp (assuming timestamp
# granularity is milliseconds). This is because HappyBase users
# expect HBase deletes to go **up to** and **including** the
# timestamp while Cloud Bigtable Time Ranges **exclude** the
# final timestamp.
next_timestamp = self._timestamp + _ONE_MILLISECOND
self._delete_range = TimestampRange(end=next_timestamp)
self._transaction = transaction
# Internal state for tracking mutations.
self._row_map = {}
self._mutation_count = 0
def send(self):
"""Send / commit the batch of mutations to the server."""
for row in self._row_map.values():
# commit() does nothing if row hasn't accumulated any mutations.
row.commit()
self._row_map.clear()
self._mutation_count = 0
def _try_send(self):
"""Send / commit the batch if mutations have exceeded batch size."""
if self._batch_size and self._mutation_count >= self._batch_size:
self.send()
def _get_row(self, row_key):
"""Gets a row that will hold mutations.
If the row is not already cached on the current batch, a new row will
be created.
:type row_key: str
:param row_key: The row key for a row stored in the map.
:rtype: :class:`Row <gcloud.bigtable.row.Row>`
:returns: The newly created or stored row that will hold mutations.
"""
if row_key not in self._row_map:
table = self._table._low_level_table
self._row_map[row_key] = table.row(row_key)
return self._row_map[row_key]
def put(self, row, data, wal=_WAL_SENTINEL):
"""Insert data into a row in the table owned by this batch.
:type row: str
:param row: The row key where the mutation will be "put".
:type data: dict
:param data: Dictionary containing the data to be inserted. The keys
are columns names (of the form ``fam:col``) and the values
are strings (bytes) to be stored in those columns.
:type wal: object
:param wal: Unused parameter (to over-ride the default on the
instance). Provided for compatibility with HappyBase, but
irrelevant for Cloud Bigtable since it does not have a
Write Ahead Log.
"""
if wal is not _WAL_SENTINEL:
_WARN(_WAL_WARNING)
row_object = self._get_row(row)
# Make sure all the keys are valid before beginning
# to add mutations.
column_pairs = _get_column_pairs(six.iterkeys(data),
require_qualifier=True)
for column_family_id, column_qualifier in column_pairs:
value = data[column_family_id + ':' + column_qualifier]
row_object.set_cell(column_family_id, column_qualifier,
value, timestamp=self._timestamp)
self._mutation_count += len(data)
self._try_send()
def _delete_columns(self, columns, row_object):
"""Adds delete mutations for a list of columns and column families.
:type columns: list
:param columns: Iterable containing column names (as
strings). Each column name can be either
* an entire column family: ``fam`` or ``fam:``
* a single column: ``fam:col``
:type row_object: :class:`Row <gcloud.bigtable.row.Row>`
:param row_object: The row which will hold the delete mutations.
:raises: :class:`ValueError <exceptions.ValueError>` if the delete
timestamp range is set on the current batch, but a
column family delete is attempted.
"""
column_pairs = _get_column_pairs(columns)
for column_family_id, column_qualifier in column_pairs:
if column_qualifier is None:
if self._delete_range is not None:
raise ValueError('The Cloud Bigtable API does not support '
'adding a timestamp to '
'"DeleteFromFamily" ')
row_object.delete_cells(column_family_id,
columns=row_object.ALL_COLUMNS)
else:
row_object.delete_cell(column_family_id,
column_qualifier,
time_range=self._delete_range)
def delete(self, row, columns=None, wal=_WAL_SENTINEL):
"""Delete data from a row in the table owned by this batch.
:type row: str
:param row: The row key where the delete will occur.
:type columns: list
:param columns: (Optional) Iterable containing column names (as
strings). Each column name can be either
* an entire column family: ``fam`` or ``fam:``
* a single column: ``fam:col``
If not used, will delete the entire row.
:type wal: object
:param wal: Unused parameter (to over-ride the default on the
instance). Provided for compatibility with HappyBase, but
irrelevant for Cloud Bigtable since it does not have a
Write Ahead Log.
:raises: :class:`ValueError <exceptions.ValueError>` if the delete
timestamp range is set on the current batch, but a full row delete
is attempted.
"""
if wal is not _WAL_SENTINEL:
_WARN(_WAL_WARNING)
row_object = self._get_row(row)
if columns is None:
# Delete entire row.
if self._delete_range is not None:
raise ValueError('The Cloud Bigtable API does not support '
'adding a timestamp to "DeleteFromRow" '
'mutations')
row_object.delete()
self._mutation_count += 1
else:
self._delete_columns(columns, row_object)
self._mutation_count += len(columns)
self._try_send()
def __enter__(self):
"""Enter context manager, no set-up required."""
return self
def __exit__(self, exc_type, exc_value, traceback):
"""Exit context manager, no set-up required.
:type exc_type: type
:param exc_type: The type of the exception if one occurred while the
context manager was active. Otherwise, :data:`None`.
:type exc_value: :class:`Exception <exceptions.Exception>`
:param exc_value: An instance of ``exc_type`` if an exception occurred
while the context was active.
Otherwise, :data:`None`.
:type traceback: ``traceback`` type
:param traceback: The traceback where the exception occurred (if one
did occur). Otherwise, :data:`None`.
"""
# If the context manager encountered an exception and the batch is
# transactional, we don't commit the mutations.
if self._transaction and exc_type is not None:
return
# NOTE: For non-transactional batches, this will even commit mutations
# if an error occurred during the context manager.
self.send()
def _get_column_pairs(columns, require_qualifier=False):
"""Turns a list of column or column families into parsed pairs.
Turns a column family (``fam`` or ``fam:``) into a pair such
as ``['fam', None]`` and turns a column (``fam:col``) into
``['fam', 'col']``.
:type columns: list
:param columns: Iterable containing column names (as
strings). Each column name can be either
* an entire column family: ``fam`` or ``fam:``
* a single column: ``fam:col``
:type require_qualifier: bool
:param require_qualifier: Boolean indicating if the columns should
all have a qualifier or not.
:rtype: list
:returns: List of pairs, where the first element in each pair is the
column family and the second is the column qualifier
(or :data:`None`).
:raises: :class:`ValueError <exceptions.ValueError>` if any of the columns
are not of the expected format.
:class:`ValueError <exceptions.ValueError>` if
``require_qualifier`` is :data:`True` and one of the values is
for an entire column family
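    For example:
    .. code:: python

        _get_column_pairs([b'fam1', u'fam2:', u'fam3:col'])
        # [['fam1', None], ['fam2', None], ['fam3', 'col']]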
"""
column_pairs = []
for column in columns:
if isinstance(column, six.binary_type):
column = column.decode('utf-8')
# Remove trailing colons (i.e. for standalone column family).
if column.endswith(u':'):
column = column[:-1]
num_colons = column.count(u':')
if num_colons == 0:
# column is a column family.
if require_qualifier:
raise ValueError('column does not contain a qualifier',
column)
else:
column_pairs.append([column, None])
elif num_colons == 1:
column_pairs.append(column.split(u':'))
else:
raise ValueError('Column contains the : separator more than once')
return column_pairs

@@ -0,0 +1,484 @@
# Copyright 2016 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Google Cloud Bigtable HappyBase connection module."""
import datetime
import warnings
import six
from grpc.beta import interfaces
from grpc.framework.interfaces.face import face
try:
from happybase.hbase.ttypes import AlreadyExists
except ImportError:
from gcloud.exceptions import Conflict as AlreadyExists
from gcloud.bigtable.client import Client
from gcloud.bigtable.column_family import GCRuleIntersection
from gcloud.bigtable.column_family import MaxAgeGCRule
from gcloud.bigtable.column_family import MaxVersionsGCRule
from gcloud.bigtable.happybase.table import Table
from gcloud.bigtable.table import Table as _LowLevelTable
# Constants reproduced here for HappyBase compatibility, though values
# are all null.
COMPAT_MODES = None
THRIFT_TRANSPORTS = None
THRIFT_PROTOCOLS = None
DEFAULT_HOST = None
DEFAULT_PORT = None
DEFAULT_TRANSPORT = None
DEFAULT_COMPAT = None
DEFAULT_PROTOCOL = None
_LEGACY_ARGS = frozenset(('host', 'port', 'compat', 'transport', 'protocol'))
_WARN = warnings.warn
_DISABLE_DELETE_MSG = ('The disable argument should not be used in '
'delete_table(). Cloud Bigtable has no concept '
'of enabled / disabled tables.')
def _get_instance(timeout=None):
"""Gets instance for the default project.
Creates a client with the inferred credentials and project ID from
the local environment. Then uses
:meth:`.bigtable.client.Client.list_instances` to
get the unique instance owned by the project.
If the request fails for any reason, or if there isn't exactly one instance
owned by the project, then this function will fail.
:type timeout: int
:param timeout: (Optional) The socket timeout in milliseconds.
:rtype: :class:`gcloud.bigtable.instance.Instance`
:returns: The unique instance owned by the project inferred from
the environment.
:raises: :class:`ValueError <exceptions.ValueError>` if there is a failed
location or any number of instances other than one.
"""
client_kwargs = {'admin': True}
if timeout is not None:
client_kwargs['timeout_seconds'] = timeout / 1000.0
client = Client(**client_kwargs)
try:
client.start()
instances, failed_locations = client.list_instances()
finally:
client.stop()
if len(failed_locations) != 0:
raise ValueError('Determining instance via ListInstances encountered '
'failed locations.')
if len(instances) == 0:
raise ValueError('This client doesn\'t have access to any instances.')
if len(instances) > 1:
raise ValueError('This client has access to more than one instance. '
'Please directly pass the instance you\'d '
'like to use.')
return instances[0]
class Connection(object):
"""Connection to Cloud Bigtable backend.
.. note::
If you pass an ``instance``, it will be :meth:`.Instance.copy`-ed before
being stored on the new connection. This also copies the
:class:`Client <gcloud.bigtable.client.Client>` that created the
:class:`Instance <gcloud.bigtable.instance.Instance>` instance and the
:class:`Credentials <oauth2client.client.Credentials>` stored on the
client.
The arguments ``host``, ``port``, ``compat``, ``transport`` and
``protocol`` are allowed (as keyword arguments) for compatibility with
HappyBase. However, they will not be used in any way, and will cause a
warning if passed.
:type timeout: int
:param timeout: (Optional) The socket timeout in milliseconds.
:type autoconnect: bool
:param autoconnect: (Optional) Whether the connection should be
:meth:`open`-ed during construction.
:type table_prefix: str
:param table_prefix: (Optional) Prefix used to construct table names.
:type table_prefix_separator: str
:param table_prefix_separator: (Optional) Separator used with
``table_prefix``. Defaults to ``_``.
:type instance: :class:`Instance <gcloud.bigtable.instance.Instance>`
:param instance: (Optional) A Cloud Bigtable instance. The instance also
owns a client for making gRPC requests to the Cloud
Bigtable API. If not passed in, defaults to creating client
with ``admin=True`` and using the ``timeout`` here for the
``timeout_seconds`` argument to the
:class:`Client <gcloud.bigtable.client.Client>`
constructor. The credentials for the client
will be the implicit ones loaded from the environment.
Then that client is used to retrieve all the instances
owned by the client's project.
:type kwargs: dict
:param kwargs: Remaining keyword arguments. Provided for HappyBase
compatibility.
"""
_instance = None
def __init__(self, timeout=None, autoconnect=True, table_prefix=None,
table_prefix_separator='_', instance=None, **kwargs):
self._handle_legacy_args(kwargs)
if table_prefix is not None:
if not isinstance(table_prefix, six.string_types):
raise TypeError('table_prefix must be a string', 'received',
table_prefix, type(table_prefix))
if not isinstance(table_prefix_separator, six.string_types):
raise TypeError('table_prefix_separator must be a string',
'received', table_prefix_separator,
type(table_prefix_separator))
self.table_prefix = table_prefix
self.table_prefix_separator = table_prefix_separator
if instance is None:
self._instance = _get_instance(timeout=timeout)
else:
if timeout is not None:
raise ValueError('Timeout cannot be used when an existing '
'instance is passed')
self._instance = instance.copy()
if autoconnect:
self.open()
self._initialized = True
@staticmethod
def _handle_legacy_args(arguments_dict):
"""Check legacy HappyBase arguments and warn if set.
:type arguments_dict: dict
:param arguments_dict: Unused keyword arguments.
:raises: :class:`TypeError <exceptions.TypeError>` if a keyword other
than ``host``, ``port``, ``compat``, ``transport`` or
``protocol`` is used.
"""
common_args = _LEGACY_ARGS.intersection(six.iterkeys(arguments_dict))
if common_args:
all_args = ', '.join(common_args)
message = ('The HappyBase legacy arguments %s were used. These '
'arguments are unused by gcloud.' % (all_args,))
_WARN(message)
for arg_name in common_args:
arguments_dict.pop(arg_name)
if arguments_dict:
unexpected_names = arguments_dict.keys()
raise TypeError('Received unexpected arguments', unexpected_names)
def open(self):
"""Open the underlying transport to Cloud Bigtable.
This method opens the underlying HTTP/2 gRPC connection using a
:class:`Client <gcloud.bigtable.client.Client>` bound to the
:class:`Instance <gcloud.bigtable.instance.Instance>` owned by
this connection.
"""
self._instance._client.start()
def close(self):
"""Close the underlying transport to Cloud Bigtable.
This method closes the underlying HTTP/2 gRPC connection using a
:class:`Client <gcloud.bigtable.client.Client>` bound to the
:class:`Instance <gcloud.bigtable.instance.Instance>` owned by
this connection.
"""
self._instance._client.stop()
def __del__(self):
if self._instance is not None:
self.close()
def _table_name(self, name):
"""Construct a table name by optionally adding a table name prefix.
:type name: str
:param name: The name to have a prefix added to it.
:rtype: str
:returns: The prefixed name, if the current connection has a table
prefix set.
"""
if self.table_prefix is None:
return name
return self.table_prefix + self.table_prefix_separator + name
def table(self, name, use_prefix=True):
"""Table factory.
:type name: str
:param name: The name of the table to be created.
:type use_prefix: bool
:param use_prefix: Whether to use the table prefix (if any).
:rtype: :class:`Table <gcloud.bigtable.happybase.table.Table>`
:returns: Table instance owned by this connection.
"""
if use_prefix:
name = self._table_name(name)
return Table(name, self)
def tables(self):
"""Return a list of table names available to this connection.
.. note::
This lists every table in the instance owned by this connection,
**not** every table that a given user may have access to.
.. note::
If ``table_prefix`` is set on this connection, only returns the
table names which match that prefix.
:rtype: list
:returns: List of string table names.
"""
low_level_table_instances = self._instance.list_tables()
table_names = [table_instance.table_id
for table_instance in low_level_table_instances]
# Filter using prefix, and strip prefix from names
if self.table_prefix is not None:
prefix = self._table_name('')
offset = len(prefix)
table_names = [name[offset:] for name in table_names
if name.startswith(prefix)]
return table_names
def create_table(self, name, families):
"""Create a table.
.. warning::
The only column family options from HappyBase that are able to be
used with Cloud Bigtable are ``max_versions`` and ``time_to_live``.
.. note::
This method is **not** atomic. The Cloud Bigtable API separates
the creation of a table from the creation of column families. Thus
this method needs to send 1 request for the table creation and 1
request for each column family. If any of these fails, the method
will fail, but the progress made towards completion cannot be
rolled back.
Values in ``families`` represent column family options. In HappyBase,
these are dictionaries, corresponding to the ``ColumnDescriptor``
structure in the Thrift API. The accepted keys are:
* ``max_versions`` (``int``)
* ``compression`` (``str``)
* ``in_memory`` (``bool``)
* ``bloom_filter_type`` (``str``)
* ``bloom_filter_vector_size`` (``int``)
* ``bloom_filter_nb_hashes`` (``int``)
* ``block_cache_enabled`` (``bool``)
* ``time_to_live`` (``int``)
:type name: str
:param name: The name of the table to be created.
:type families: dict
:param families: Dictionary with column family names as keys and column
family options as the values. The options can be among
* :class:`dict`
* :class:`.GarbageCollectionRule`
:raises: :class:`TypeError <exceptions.TypeError>` if ``families`` is
not a dictionary,
:class:`ValueError <exceptions.ValueError>` if ``families``
has no entries
"""
if not isinstance(families, dict):
raise TypeError('families arg must be a dictionary')
if not families:
raise ValueError('Cannot create table %r (no column '
'families specified)' % (name,))
# Parse all keys before making any API requests.
gc_rule_dict = {}
for column_family_name, option in families.items():
if isinstance(column_family_name, six.binary_type):
column_family_name = column_family_name.decode('utf-8')
if column_family_name.endswith(':'):
column_family_name = column_family_name[:-1]
gc_rule_dict[column_family_name] = _parse_family_option(option)
# Create table instance and then make API calls.
name = self._table_name(name)
low_level_table = _LowLevelTable(name, self._instance)
try:
low_level_table.create()
except face.NetworkError as network_err:
if network_err.code == interfaces.StatusCode.ALREADY_EXISTS:
raise AlreadyExists(name)
else:
raise
for column_family_name, gc_rule in gc_rule_dict.items():
column_family = low_level_table.column_family(
column_family_name, gc_rule=gc_rule)
column_family.create()
def delete_table(self, name, disable=False):
"""Delete the specified table.
:type name: str
:param name: The name of the table to be deleted. If ``table_prefix``
is set, a prefix will be added to the ``name``.
:type disable: bool
:param disable: Whether to first disable the table if needed. This
is provided for compatibility with HappyBase, but is
not relevant for Cloud Bigtable since it has no concept
of enabled / disabled tables.
"""
if disable:
_WARN(_DISABLE_DELETE_MSG)
name = self._table_name(name)
_LowLevelTable(name, self._instance).delete()
def enable_table(self, name):
"""Enable the specified table.
.. warning::
Cloud Bigtable has no concept of enabled / disabled tables so this
method does not work. It is provided simply for compatibility.
:raises: :class:`NotImplementedError <exceptions.NotImplementedError>`
always
"""
raise NotImplementedError('The Cloud Bigtable API has no concept of '
'enabled or disabled tables.')
def disable_table(self, name):
"""Disable the specified table.
.. warning::
Cloud Bigtable has no concept of enabled / disabled tables so this
method does not work. It is provided simply for compatibility.
:raises: :class:`NotImplementedError <exceptions.NotImplementedError>`
always
"""
raise NotImplementedError('The Cloud Bigtable API has no concept of '
'enabled or disabled tables.')
def is_table_enabled(self, name):
"""Return whether the specified table is enabled.
.. warning::
Cloud Bigtable has no concept of enabled / disabled tables so this
method does not work. It is provided simply for compatibility.
:raises: :class:`NotImplementedError <exceptions.NotImplementedError>`
always
"""
raise NotImplementedError('The Cloud Bigtable API has no concept of '
'enabled or disabled tables.')
def compact_table(self, name, major=False):
"""Compact the specified table.
.. warning::
Cloud Bigtable does not support compacting a table, so this
method does not work. It is provided simply for compatibility.
:raises: :class:`NotImplementedError <exceptions.NotImplementedError>`
always
"""
raise NotImplementedError('The Cloud Bigtable API does not support '
'compacting a table.')
def _parse_family_option(option):
"""Parses a column family option into a garbage collection rule.
.. note::
If ``option`` is not a dictionary, the type is not checked.
If ``option`` is :data:`None`, there is nothing to do, since this
is the correct output.
:type option: :class:`dict`,
:data:`NoneType <types.NoneType>`,
:class:`.GarbageCollectionRule`
:param option: A column family option passed as a dictionary value in
:meth:`Connection.create_table`.
:rtype: :class:`.GarbageCollectionRule`
:returns: A garbage collection rule parsed from the input.
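    For example (illustrative):
    .. code:: python

        _parse_family_option({'max_versions': 2, 'time_to_live': 86400})
        # GCRuleIntersection of MaxAgeGCRule(86400 seconds) and
        # MaxVersionsGCRule(2)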
"""
result = option
if isinstance(result, dict):
if not set(result.keys()) <= set(['max_versions', 'time_to_live']):
all_keys = ', '.join(repr(key) for key in result.keys())
warning_msg = ('Cloud Bigtable only supports max_versions and '
'time_to_live column family settings. '
'Received: %s' % (all_keys,))
_WARN(warning_msg)
max_num_versions = result.get('max_versions')
max_age = None
if 'time_to_live' in result:
max_age = datetime.timedelta(seconds=result['time_to_live'])
versions_rule = age_rule = None
if max_num_versions is not None:
versions_rule = MaxVersionsGCRule(max_num_versions)
if max_age is not None:
age_rule = MaxAgeGCRule(max_age)
if versions_rule is None:
result = age_rule
else:
if age_rule is None:
result = versions_rule
else:
result = GCRuleIntersection(rules=[age_rule, versions_rule])
return result

@@ -0,0 +1,153 @@
# Copyright 2016 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Google Cloud Bigtable HappyBase pool module."""
import contextlib
import threading
import six
from gcloud.bigtable.happybase.connection import Connection
from gcloud.bigtable.happybase.connection import _get_instance
_MIN_POOL_SIZE = 1
"""Minimum allowable size of a connection pool."""
class NoConnectionsAvailable(RuntimeError):
"""Exception raised when no connections are available.
This happens if a timeout was specified when obtaining a connection,
and no connection became available within the specified timeout.
"""
class ConnectionPool(object):
"""Thread-safe connection pool.
.. note::
All keyword arguments are passed unmodified to the
:class:`Connection <.happybase.connection.Connection>` constructor
**except** for ``autoconnect``. This is because the ``open`` /
``closed`` status of a connection is managed by the pool. In addition,
if ``instance`` is not passed, the default / inferred instance is
determined by the pool and then passed to each
:class:`Connection <.happybase.connection.Connection>` that is created.
:type size: int
:param size: The maximum number of concurrently open connections.
:type kwargs: dict
:param kwargs: Keyword arguments passed to
:class:`Connection <.happybase.Connection>`
constructor.
:raises: :class:`TypeError <exceptions.TypeError>` if ``size``
is not an integer.
:class:`ValueError <exceptions.ValueError>` if ``size``
is not positive.
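    For example, a sketch (assuming an existing ``instance``; if omitted,
    the default instance is inferred from the environment):
    .. code:: python

        pool = ConnectionPool(10, instance=instance)
        with pool.connection(timeout=5) as connection:
            table = connection.table('my-table')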
"""
def __init__(self, size, **kwargs):
if not isinstance(size, six.integer_types):
raise TypeError('Pool size arg must be an integer')
if size < _MIN_POOL_SIZE:
raise ValueError('Pool size must be positive')
self._lock = threading.Lock()
self._queue = six.moves.queue.LifoQueue(maxsize=size)
self._thread_connections = threading.local()
connection_kwargs = kwargs
connection_kwargs['autoconnect'] = False
if 'instance' not in connection_kwargs:
connection_kwargs['instance'] = _get_instance(
timeout=kwargs.get('timeout'))
for _ in six.moves.range(size):
connection = Connection(**connection_kwargs)
self._queue.put(connection)
def _acquire_connection(self, timeout=None):
"""Acquire a connection from the pool.
:type timeout: int
:param timeout: (Optional) Time (in seconds) to wait for a connection
to open.
:rtype: :class:`Connection <.happybase.Connection>`
:returns: An active connection from the queue stored on the pool.
:raises: :class:`NoConnectionsAvailable` if ``Queue.get`` fails
before the ``timeout`` (only if a timeout is specified).
"""
try:
return self._queue.get(block=True, timeout=timeout)
except six.moves.queue.Empty:
raise NoConnectionsAvailable('No connection available from pool '
'within specified timeout')
@contextlib.contextmanager
def connection(self, timeout=None):
"""Obtain a connection from the pool.
Must be used as a context manager, for example::
with pool.connection() as connection:
pass # do something with the connection
If ``timeout`` is omitted, this method waits forever for a connection
to become available from the local queue.
:type timeout: int
:param timeout: (Optional) Time (in seconds) to wait for a connection
to open.
:rtype: :class:`Connection <.happybase.connection.Connection>`
:returns: An active connection from the pool.
:raises: :class:`NoConnectionsAvailable` if no connection can be
retrieved from the pool before the ``timeout`` (only if
a timeout is specified).
"""
connection = getattr(self._thread_connections, 'current', None)
retrieved_new_cnxn = False
if connection is None:
# In this case we need to actually grab a connection from the
# pool. After retrieval, the connection is stored on a thread
# local so that nested connection requests from the same
# thread can re-use the same connection instance.
#
# NOTE: This code acquires a lock before assigning to the
# thread local; see
# ('https://emptysqua.re/blog/'
# 'another-thing-about-pythons-threadlocals/')
retrieved_new_cnxn = True
connection = self._acquire_connection(timeout)
with self._lock:
self._thread_connections.current = connection
# This is a no-op for connections that have already been opened
# since they just call Client.start().
connection.open()
yield connection
# Remove thread local reference after the outermost 'with' block
# ends. Afterwards the thread no longer owns the connection.
if retrieved_new_cnxn:
del self._thread_connections.current
self._queue.put(connection)

@@ -0,0 +1,980 @@
# Copyright 2016 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Google Cloud Bigtable HappyBase table module."""
import struct
import warnings
import six
from gcloud._helpers import _datetime_from_microseconds
from gcloud._helpers import _microseconds_from_datetime
from gcloud._helpers import _to_bytes
from gcloud._helpers import _total_seconds
from gcloud.bigtable.column_family import GCRuleIntersection
from gcloud.bigtable.column_family import MaxAgeGCRule
from gcloud.bigtable.column_family import MaxVersionsGCRule
from gcloud.bigtable.happybase.batch import _get_column_pairs
from gcloud.bigtable.happybase.batch import _WAL_SENTINEL
from gcloud.bigtable.happybase.batch import Batch
from gcloud.bigtable.row_filters import CellsColumnLimitFilter
from gcloud.bigtable.row_filters import ColumnQualifierRegexFilter
from gcloud.bigtable.row_filters import FamilyNameRegexFilter
from gcloud.bigtable.row_filters import RowFilterChain
from gcloud.bigtable.row_filters import RowFilterUnion
from gcloud.bigtable.row_filters import RowKeyRegexFilter
from gcloud.bigtable.row_filters import TimestampRange
from gcloud.bigtable.row_filters import TimestampRangeFilter
from gcloud.bigtable.table import Table as _LowLevelTable
_WARN = warnings.warn
_UNPACK_I64 = struct.Struct('>q').unpack
_SIMPLE_GC_RULES = (MaxAgeGCRule, MaxVersionsGCRule)
def make_row(cell_map, include_timestamp):
"""Make a row dict for a Thrift cell mapping.
.. warning::
This method is only provided for HappyBase compatibility, but does not
actually work.
:type cell_map: dict
:param cell_map: Dictionary with ``fam:col`` strings as keys and ``TCell``
instances as values.
:type include_timestamp: bool
:param include_timestamp: Flag to indicate if cell timestamps should be
included with the output.
:raises: :class:`NotImplementedError <exceptions.NotImplementedError>`
always
"""
raise NotImplementedError('The Cloud Bigtable API output is not the same '
'as the output from the Thrift server, so this '
'helper can not be implemented.', 'Called with',
cell_map, include_timestamp)
def make_ordered_row(sorted_columns, include_timestamp):
"""Make a row dict for sorted Thrift column results from scans.
.. warning::
This method is only provided for HappyBase compatibility, but does not
actually work.
:type sorted_columns: list
:param sorted_columns: List of ``TColumn`` instances from Thrift.
:type include_timestamp: bool
:param include_timestamp: Flag to indicate if cell timestamps should be
included with the output.
:raises: :class:`NotImplementedError <exceptions.NotImplementedError>`
always
"""
raise NotImplementedError('The Cloud Bigtable API output is not the same '
'as the output from the Thrift server, so this '
'helper can not be implemented.', 'Called with',
sorted_columns, include_timestamp)
class Table(object):
"""Representation of Cloud Bigtable table.
Used for adding and retrieving data.
:type name: str
:param name: The name of the table.
:type connection: :class:`Connection <.happybase.connection.Connection>`
:param connection: The connection which has access to the table.
"""
def __init__(self, name, connection):
self.name = name
# This remains as legacy for HappyBase, but only the instance
# from the connection is needed.
self.connection = connection
self._low_level_table = None
if self.connection is not None:
self._low_level_table = _LowLevelTable(self.name,
self.connection._instance)
def __repr__(self):
return '<table.Table name=%r>' % (self.name,)
def families(self):
"""Retrieve the column families for this table.
:rtype: dict
:returns: Mapping from column family name to garbage collection rule
for a column family.
"""
column_family_map = self._low_level_table.list_column_families()
result = {}
for col_fam, col_fam_obj in six.iteritems(column_family_map):
result[col_fam] = _gc_rule_to_dict(col_fam_obj.gc_rule)
return result
def regions(self):
"""Retrieve the regions for this table.
.. warning::
Cloud Bigtable does not give information about how a table is laid
out in memory, so this method does not work. It is
provided simply for compatibility.
:raises: :class:`NotImplementedError <exceptions.NotImplementedError>`
always
"""
raise NotImplementedError('The Cloud Bigtable API does not have a '
'concept of splitting a table into regions.')
def row(self, row, columns=None, timestamp=None, include_timestamp=False):
"""Retrieve a single row of data.
Returns the latest cells in each column (or all columns if ``columns``
is not specified). If a ``timestamp`` is set, then **latest** becomes
**latest** up until ``timestamp``.
:type row: str
:param row: Row key for the row we are reading from.
:type columns: list
:param columns: (Optional) Iterable containing column names (as
strings). Each column name can be either
* an entire column family: ``fam`` or ``fam:``
* a single column: ``fam:col``
:type timestamp: int
:param timestamp: (Optional) Timestamp (in milliseconds since the
epoch). If specified, only cells returned before the
timestamp will be returned.
:type include_timestamp: bool
:param include_timestamp: Flag to indicate if cell timestamps should be
included with the output.
:rtype: dict
:returns: Dictionary containing all the latest column values in
the row.
"""
filters = []
if columns is not None:
filters.append(_columns_filter_helper(columns))
# versions == 1 since we only want the latest.
filter_ = _filter_chain_helper(versions=1, timestamp=timestamp,
filters=filters)
partial_row_data = self._low_level_table.read_row(
row, filter_=filter_)
if partial_row_data is None:
return {}
return _partial_row_to_dict(partial_row_data,
include_timestamp=include_timestamp)
def rows(self, rows, columns=None, timestamp=None,
include_timestamp=False):
"""Retrieve multiple rows of data.
All optional arguments behave the same in this method as they do in
:meth:`row`.
:type rows: list
:param rows: Iterable of the row keys for the rows we are reading from.
:type columns: list
:param columns: (Optional) Iterable containing column names (as
strings). Each column name can be either
* an entire column family: ``fam`` or ``fam:``
* a single column: ``fam:col``
:type timestamp: int
:param timestamp: (Optional) Timestamp (in milliseconds since the
epoch). If specified, only cells returned before (or
at) the timestamp will be returned.
:type include_timestamp: bool
:param include_timestamp: Flag to indicate if cell timestamps should be
included with the output.
:rtype: list
:returns: A list of pairs, where the first is the row key and the
second is a dictionary with the filtered values returned.
"""
if not rows:
# Avoid round-trip if the result is empty anyway
return []
filters = []
if columns is not None:
filters.append(_columns_filter_helper(columns))
filters.append(_row_keys_filter_helper(rows))
# versions == 1 since we only want the latest.
filter_ = _filter_chain_helper(versions=1, timestamp=timestamp,
filters=filters)
partial_rows_data = self._low_level_table.read_rows(filter_=filter_)
# NOTE: We could use max_loops = 1000 or some similar value to ensure
# that the stream isn't open too long.
partial_rows_data.consume_all()
result = []
for row_key in rows:
if row_key not in partial_rows_data.rows:
continue
curr_row_data = partial_rows_data.rows[row_key]
curr_row_dict = _partial_row_to_dict(
curr_row_data, include_timestamp=include_timestamp)
result.append((row_key, curr_row_dict))
return result
def cells(self, row, column, versions=None, timestamp=None,
include_timestamp=False):
"""Retrieve multiple versions of a single cell from the table.
:type row: str
:param row: Row key for the row we are reading from.
:type column: str
:param column: Column we are reading from; of the form ``fam:col``.
:type versions: int
:param versions: (Optional) The maximum number of cells to return. If
not set, returns all cells found.
:type timestamp: int
:param timestamp: (Optional) Timestamp (in milliseconds since the
epoch). If specified, only cells returned before (or
at) the timestamp will be returned.
:type include_timestamp: bool
:param include_timestamp: Flag to indicate if cell timestamps should be
included with the output.
:rtype: list
:returns: List of values in the cell (with timestamps if
``include_timestamp`` is :data:`True`).
"""
filter_ = _filter_chain_helper(column=column, versions=versions,
timestamp=timestamp)
partial_row_data = self._low_level_table.read_row(row, filter_=filter_)
if partial_row_data is None:
return []
else:
cells = partial_row_data._cells
# We know that `_filter_chain_helper` has already verified that
# column will split as such.
column_family_id, column_qualifier = column.split(':')
# NOTE: We expect the only key in `cells` is `column_family_id`
# and the only key `cells[column_family_id]` is
# `column_qualifier`. But we don't check that this is true.
curr_cells = cells[column_family_id][column_qualifier]
return _cells_to_pairs(
curr_cells, include_timestamp=include_timestamp)
def scan(self, row_start=None, row_stop=None, row_prefix=None,
columns=None, timestamp=None,
include_timestamp=False, limit=None, **kwargs):
"""Create a scanner for data in this table.
This method returns a generator that can be used for looping over the
matching rows.
If ``row_prefix`` is specified, only rows with row keys matching the
prefix will be returned. If given, ``row_start`` and ``row_stop``
cannot be used.
.. note::
Both ``row_start`` and ``row_stop`` can be :data:`None` to specify
the start and the end of the table respectively. If both are
omitted, a full table scan is done. Note that this usually results
in severe performance problems.
The keyword argument ``filter`` is also supported (beyond column and
row range filters supported here). HappyBase / HBase users will have
used this as an HBase filter string. (See the `Thrift docs`_ for more
details on those filters.) However, Google Cloud Bigtable doesn't
support those filter strings so a
:class:`~gcloud.bigtable.row.RowFilter` should be used instead.
.. _Thrift docs: http://hbase.apache.org/0.94/book/thrift.html
The arguments ``batch_size``, ``scan_batching`` and ``sorted_columns``
are allowed (as keyword arguments) for compatibility with
HappyBase. However, they will not be used in any way, and will cause a
warning if passed. (The ``batch_size`` determines the number of
results to retrieve per request. The HBase scanner defaults to reading
one record at a time, so this argument allows HappyBase to increase
that number. However, the Cloud Bigtable API uses HTTP/2 streaming so
there is no concept of a batched scan. The ``sorted_columns`` flag
tells HBase to return columns in order, but Cloud Bigtable doesn't
have this feature.)
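    For example, a sketch of a bounded prefix scan (assuming an existing
    ``table``):
    .. code:: python

        for row_key, row_dict in table.scan(row_prefix=b'user-', limit=10):
            print(row_key, row_dict)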
:type row_start: str
:param row_start: (Optional) Row key where the scanner should start
(includes ``row_start``). If not specified, reads
from the first key. If the table does not contain
``row_start``, it will start from the next key after
it that **is** contained in the table.
:type row_stop: str
:param row_stop: (Optional) Row key where the scanner should stop
(excludes ``row_stop``). If not specified, reads
until the last key. The table does not have to contain
``row_stop``.
:type row_prefix: str
:param row_prefix: (Optional) Prefix to match row keys.
:type columns: list
:param columns: (Optional) Iterable containing column names (as
strings). Each column name can be either
* an entire column family: ``fam`` or ``fam:``
* a single column: ``fam:col``
:type timestamp: int
:param timestamp: (Optional) Timestamp (in milliseconds since the
epoch). If specified, only cells returned before (or
at) the timestamp will be returned.
:type include_timestamp: bool
:param include_timestamp: Flag to indicate if cell timestamps should be
included with the output.
:type limit: int
:param limit: (Optional) Maximum number of rows to return.
:type kwargs: dict
:param kwargs: Remaining keyword arguments. Provided for HappyBase
compatibility.
:raises: :class:`ValueError <exceptions.ValueError>` if ``limit`` is set
but non-positive, or if ``row_prefix`` is used with row start/stop;
:class:`TypeError <exceptions.TypeError>` if a string
``filter`` is used.
"""
row_start, row_stop, filter_chain = _scan_filter_helper(
row_start, row_stop, row_prefix, columns, timestamp, limit, kwargs)
partial_rows_data = self._low_level_table.read_rows(
start_key=row_start, end_key=row_stop,
limit=limit, filter_=filter_chain)
# Mutable copy of data.
rows_dict = partial_rows_data.rows
while True:
try:
partial_rows_data.consume_next()
for row_key in sorted(rows_dict):
curr_row_data = rows_dict.pop(row_key)
# NOTE: We expect len(rows_dict) == 0, but don't check it.
curr_row_dict = _partial_row_to_dict(
curr_row_data, include_timestamp=include_timestamp)
yield (row_key, curr_row_dict)
except StopIteration:
break
def put(self, row, data, timestamp=None, wal=_WAL_SENTINEL):
"""Insert data into a row in this table.
.. note::
This method will send a request with a single "put" mutation.
In many situations, :meth:`batch` is a more appropriate
method to manipulate data since it helps combine many mutations
into a single request.
:type row: str
:param row: The row key where the mutation will be "put".
:type data: dict
:param data: Dictionary containing the data to be inserted. The keys
are columns names (of the form ``fam:col``) and the values
are strings (bytes) to be stored in those columns.
:type timestamp: int
:param timestamp: (Optional) Timestamp (in milliseconds since the
epoch) that the mutation will be applied at.
:type wal: object
:param wal: Unused parameter (to be passed to a created batch).
Provided for compatibility with HappyBase, but irrelevant
for Cloud Bigtable since it does not have a Write Ahead
Log.
"""
with self.batch(timestamp=timestamp, wal=wal) as batch:
batch.put(row, data)
def delete(self, row, columns=None, timestamp=None, wal=_WAL_SENTINEL):
"""Delete data from a row in this table.
This method deletes the entire ``row`` if ``columns`` is not
specified.
.. note::
This method will send a request with a single delete mutation.
In many situations, :meth:`batch` is a more appropriate
method to manipulate data since it helps combine many mutations
into a single request.
:type row: str
:param row: The row key where the delete will occur.
:type columns: list
:param columns: (Optional) Iterable containing column names (as
strings). Each column name can be either
* an entire column family: ``fam`` or ``fam:``
* a single column: ``fam:col``
:type timestamp: int
:param timestamp: (Optional) Timestamp (in milliseconds since the
epoch) that the mutation will be applied at.
:type wal: object
:param wal: Unused parameter (to be passed to a created batch).
Provided for compatibility with HappyBase, but irrelevant
for Cloud Bigtable since it does not have a Write Ahead
Log.
"""
with self.batch(timestamp=timestamp, wal=wal) as batch:
batch.delete(row, columns)
def batch(self, timestamp=None, batch_size=None, transaction=False,
wal=_WAL_SENTINEL):
"""Create a new batch operation for this table.
This method returns a new
:class:`Batch <.happybase.batch.Batch>` instance that can be
used for mass data manipulation.
:type timestamp: int
:param timestamp: (Optional) Timestamp (in milliseconds since the
epoch) that all mutations will be applied at.
:type batch_size: int
:param batch_size: (Optional) The maximum number of mutations to allow
to accumulate before committing them.
:type transaction: bool
:param transaction: Flag indicating if the mutations should be sent
transactionally or not. If ``transaction=True`` and
an error occurs while a
:class:`Batch <.happybase.batch.Batch>` is
active, then none of the accumulated mutations will
be committed. If ``batch_size`` is set, the
mutation can't be transactional.
:type wal: object
:param wal: Unused parameter (to be passed to the created batch).
Provided for compatibility with HappyBase, but irrelevant
for Cloud Bigtable since it does not have a Write Ahead
Log.
:rtype: :class:`Batch <gcloud.bigtable.happybase.batch.Batch>`
:returns: A batch bound to this table.
"""
return Batch(self, timestamp=timestamp, batch_size=batch_size,
transaction=transaction, wal=wal)
def counter_get(self, row, column):
"""Retrieve the current value of a counter column.
This method retrieves the current value of a counter column. If the
counter column does not exist, this function initializes it to ``0``.
.. note::
Application code should **never** store a counter value directly;
use the atomic :meth:`counter_inc` and :meth:`counter_dec` methods
for that.
:type row: str
:param row: Row key for the row we are getting a counter from.
:type column: str
:param column: Column we are ``get``-ing from; of the form ``fam:col``.
:rtype: int
:returns: Counter value (after initializing / incrementing by 0).
"""
# Don't query directly, but increment with value=0 so that the counter
# is correctly initialized if it didn't exist yet.
return self.counter_inc(row, column, value=0)
def counter_set(self, row, column, value=0):
"""Set a counter column to a specific value.
This method is provided in HappyBase, but we do not provide it here
because it defeats the purpose of using atomic increment and decrement
of a counter.
:type row: str
:param row: Row key for the row we are setting a counter in.
:type column: str
:param column: Column we are setting a value in; of
the form ``fam:col``.
:type value: int
:param value: Value to set the counter to.
:raises: :class:`NotImplementedError <exceptions.NotImplementedError>`
always
"""
raise NotImplementedError('Table.counter_set will not be implemented. '
'Instead use the increment/decrement '
'methods along with counter_get.')
def counter_inc(self, row, column, value=1):
"""Atomically increment a counter column.
This method atomically increments a counter column in ``row``.
If the counter column does not exist, it is automatically initialized
to ``0`` before being incremented.
:type row: str
:param row: Row key for the row we are incrementing a counter in.
:type column: str
:param column: Column we are incrementing a value in; of the
form ``fam:col``.
:type value: int
:param value: Amount to increment the counter by. (If negative,
this is equivalent to decrement.)
:rtype: int
:returns: Counter value after incrementing.
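    For example (assuming the counter column does not exist yet):
    .. code:: python

        table.counter_inc(b'row-key', b'cf:visits')            # returns 1
        table.counter_inc(b'row-key', b'cf:visits', value=10)  # returns 11
        table.counter_get(b'row-key', b'cf:visits')            # returns 11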
"""
row = self._low_level_table.row(row, append=True)
if isinstance(column, six.binary_type):
column = column.decode('utf-8')
column_family_id, column_qualifier = column.split(':')
row.increment_cell_value(column_family_id, column_qualifier, value)
# AppendRow.commit() will return a dictionary:
# {
# u'col-fam-id': {
# b'col-name1': [
# (b'cell-val', datetime.datetime(...)),
# ...
# ],
# ...
# },
# }
modified_cells = row.commit()
# Get the cells in the modified column.
column_cells = modified_cells[column_family_id][column_qualifier]
# Make sure there is exactly one cell in the column.
if len(column_cells) != 1:
raise ValueError('Expected server to return one modified cell.')
column_cell = column_cells[0]
# Get the bytes value from the column and convert it to an integer.
bytes_value = column_cell[0]
int_value, = _UNPACK_I64(bytes_value)
return int_value
def counter_dec(self, row, column, value=1):
"""Atomically decrement a counter column.
This method atomically decrements a counter column in ``row``.
If the counter column does not exist, it is automatically initialized
to ``0`` before being decremented.
:type row: str
:param row: Row key for the row we are decrementing a counter in.
:type column: str
:param column: Column we are decrementing a value in; of the
form ``fam:col``.
:type value: int
:param value: Amount to decrement the counter by. (If negative,
this is equivalent to increment.)
:rtype: int
:returns: Counter value after decrementing.
"""
return self.counter_inc(row, column, -value)
def _gc_rule_to_dict(gc_rule):
"""Converts garbage collection rule to dictionary if possible.
This is in place to support dictionary values, as is done in HappyBase,
which has somewhat different garbage collection rule settings for
column families.
Only does this if the garbage collection rule is:
* :class:`gcloud.bigtable.column_family.MaxAgeGCRule`
* :class:`gcloud.bigtable.column_family.MaxVersionsGCRule`
* Composite :class:`gcloud.bigtable.column_family.GCRuleIntersection`
with two rules, one each of type
:class:`gcloud.bigtable.column_family.MaxAgeGCRule` and
:class:`gcloud.bigtable.column_family.MaxVersionsGCRule`
Otherwise, just returns the input without change.
:type gc_rule: :data:`NoneType <types.NoneType>`,
:class:`.GarbageCollectionRule`
:param gc_rule: A garbage collection rule to convert to a dictionary
(if possible).
:rtype: dict or
:class:`gcloud.bigtable.column_family.GarbageCollectionRule`
:returns: The converted garbage collection rule.
"""
result = gc_rule
if gc_rule is None:
result = {}
elif isinstance(gc_rule, MaxAgeGCRule):
result = {'time_to_live': _total_seconds(gc_rule.max_age)}
elif isinstance(gc_rule, MaxVersionsGCRule):
result = {'max_versions': gc_rule.max_num_versions}
elif isinstance(gc_rule, GCRuleIntersection):
if len(gc_rule.rules) == 2:
rule1, rule2 = gc_rule.rules
if (isinstance(rule1, _SIMPLE_GC_RULES) and
isinstance(rule2, _SIMPLE_GC_RULES)):
rule1 = _gc_rule_to_dict(rule1)
rule2 = _gc_rule_to_dict(rule2)
key1, = rule1.keys()
key2, = rule2.keys()
if key1 != key2:
result = {key1: rule1[key1], key2: rule2[key2]}
return result
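# Example (illustrative sketch of the conversions above):
#
#     >>> import datetime
#     >>> _gc_rule_to_dict(None)
#     {}
#     >>> _gc_rule_to_dict(MaxAgeGCRule(datetime.timedelta(days=1)))
#     {'time_to_live': 86400}
#
# (86400 is one day in seconds; the exact numeric type comes from
# ``_total_seconds``. Any rule not listed above is returned unchanged.)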
def _next_char(str_val, index):
"""Gets the next character based on a position in a string.
:type str_val: str
:param str_val: A string containing the character to update.
:type index: int
:param index: An integer index in ``str_val``.
:rtype: str
:returns: The next character after the character at ``index``
in ``str_val``.
"""
ord_val = six.indexbytes(str_val, index)
return _to_bytes(chr(ord_val + 1), encoding='latin-1')
def _string_successor(str_val):
"""Increment and truncate a byte string.
Determines the shortest string that sorts after the given string when
compared using regular string comparison semantics.
Modeled after implementation in ``gcloud-golang``.
Increments the last byte that is smaller than ``0xFF``, and
drops everything after it. If the string only contains ``0xFF`` bytes,
``''`` is returned.
:type str_val: str
:param str_val: String to increment.
:rtype: str
:returns: The next string in lexical order after ``str_val``.
"""
str_val = _to_bytes(str_val, encoding='latin-1')
if str_val == b'':
return str_val
index = len(str_val) - 1
while index >= 0:
if six.indexbytes(str_val, index) != 0xff:
break
index -= 1
if index == -1:
return b''
return str_val[:index] + _next_char(str_val, index)
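# Example (illustrative sketch of the successor logic):
#
#     >>> _string_successor(b'row-a')
#     b'row-b'
#     >>> _string_successor(b'a\xff\xff')
#     b'b'
#     >>> _string_successor(b'\xff\xff')
#     b''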
def _convert_to_time_range(timestamp=None):
"""Create a timestamp range from an HBase / HappyBase timestamp.
HBase uses timestamp as an argument to specify an exclusive end
deadline. Cloud Bigtable also uses exclusive end times, so
the behavior matches.
:type timestamp: int
:param timestamp: (Optional) Timestamp (in milliseconds since the
epoch). Intended to be used as the end of an HBase
time range, which is exclusive.
:rtype: :class:`gcloud.bigtable.row.TimestampRange`,
:data:`NoneType <types.NoneType>`
:returns: The timestamp range corresponding to the passed in
``timestamp``.
"""
if timestamp is None:
return None
next_timestamp = _datetime_from_microseconds(1000 * timestamp)
return TimestampRange(end=next_timestamp)
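# Example (illustrative sketch, assuming ``TimestampRange`` exposes the
# ``end`` it was built with):
#
#     >>> time_range = _convert_to_time_range(timestamp=1456361486255)
#     >>> time_range.end == _datetime_from_microseconds(1456361486255000)
#     True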
def _cells_to_pairs(cells, include_timestamp=False):
"""Converts list of cells to HappyBase format.
For example::
>>> import datetime
>>> from gcloud.bigtable.row_data import Cell
>>> cell1 = Cell(b'val1', datetime.datetime.utcnow())
>>> cell2 = Cell(b'val2', datetime.datetime.utcnow())
>>> _cells_to_pairs([cell1, cell2])
[b'val1', b'val2']
>>> _cells_to_pairs([cell1, cell2], include_timestamp=True)
[(b'val1', 1456361486255), (b'val2', 1456361491927)]
:type cells: list
:param cells: List of :class:`gcloud.bigtable.row_data.Cell` returned
from a read request.
:type include_timestamp: bool
:param include_timestamp: Flag to indicate if cell timestamps should be
included with the output.
:rtype: list
:returns: List of values in the cell. If ``include_timestamp=True``, each
value will be a pair, with the first part the bytes value in
the cell and the second part the number of milliseconds in the
timestamp on the cell.
"""
result = []
for cell in cells:
if include_timestamp:
ts_millis = _microseconds_from_datetime(cell.timestamp) // 1000
result.append((cell.value, ts_millis))
else:
result.append(cell.value)
return result
def _partial_row_to_dict(partial_row_data, include_timestamp=False):
"""Convert a low-level row data object to a dictionary.
Assumes only the latest value in each column is needed. This assumption
is due to the fact that this method is used by callers which use
a ``CellsColumnLimitFilter(1)`` filter.
For example::
>>> import datetime
>>> from gcloud.bigtable.row_data import Cell, PartialRowData
>>> cell1 = Cell(b'val1', datetime.datetime.utcnow())
>>> cell2 = Cell(b'val2', datetime.datetime.utcnow())
>>> row_data = PartialRowData(b'row-key')
>>> _partial_row_to_dict(row_data)
{}
>>> row_data._cells[u'fam1'] = {b'col1': [cell1], b'col2': [cell2]}
>>> _partial_row_to_dict(row_data)
{b'fam1:col2': b'val2', b'fam1:col1': b'val1'}
>>> _partial_row_to_dict(row_data, include_timestamp=True)
{b'fam1:col2': (b'val2', 1456361724480),
b'fam1:col1': (b'val1', 1456361721135)}
:type partial_row_data: :class:`.row_data.PartialRowData`
:param partial_row_data: Row data consumed from a stream.
:type include_timestamp: bool
:param include_timestamp: Flag to indicate if cell timestamps should be
included with the output.
:rtype: dict
:returns: The row data converted to a dictionary.
"""
result = {}
for column, cells in six.iteritems(partial_row_data.to_dict()):
cell_vals = _cells_to_pairs(cells,
include_timestamp=include_timestamp)
# NOTE: We assume there is exactly 1 version since we used that in
# our filter, but we don't check this.
result[column] = cell_vals[0]
return result
def _filter_chain_helper(column=None, versions=None, timestamp=None,
filters=None):
"""Create filter chain to limit a results set.
:type column: str
:param column: (Optional) The column (``fam:col``) to be selected
with the filter.
:type versions: int
:param versions: (Optional) The maximum number of cells to return.
:type timestamp: int
:param timestamp: (Optional) Timestamp (in milliseconds since the
epoch). If specified, only cells with a timestamp strictly
before the given value will be matched, since the
underlying range uses an exclusive end.
:type filters: list
:param filters: (Optional) List of existing filters to be extended.
:rtype: :class:`RowFilter <gcloud.bigtable.row.RowFilter>`
:returns: The chained filter created, or just a single filter if only
one was needed.
:raises: :class:`ValueError <exceptions.ValueError>` if there are no
filters to chain.
"""
if filters is None:
filters = []
if column is not None:
if isinstance(column, six.binary_type):
column = column.decode('utf-8')
column_family_id, column_qualifier = column.split(':')
fam_filter = FamilyNameRegexFilter(column_family_id)
qual_filter = ColumnQualifierRegexFilter(column_qualifier)
filters.extend([fam_filter, qual_filter])
if versions is not None:
filters.append(CellsColumnLimitFilter(versions))
time_range = _convert_to_time_range(timestamp=timestamp)
if time_range is not None:
filters.append(TimestampRangeFilter(time_range))
num_filters = len(filters)
if num_filters == 0:
raise ValueError('Must have at least one filter.')
elif num_filters == 1:
return filters[0]
else:
return RowFilterChain(filters=filters)
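# Example (illustrative sketch): selecting the latest cell in ``fam:col``
# written strictly before a millisecond timestamp chains four filters
# (family regex, qualifier regex, cells limit and timestamp range):
#
#     >>> filter_ = _filter_chain_helper(column='fam:col', versions=1,
#     ...                                timestamp=1456361486255)
#     >>> isinstance(filter_, RowFilterChain)
#     True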
def _scan_filter_helper(row_start, row_stop, row_prefix, columns,
timestamp, limit, kwargs):
"""Helper for :meth:`scan`: build up a filter chain."""
filter_ = kwargs.pop('filter', None)
legacy_args = []
for kw_name in ('batch_size', 'scan_batching', 'sorted_columns'):
if kw_name in kwargs:
legacy_args.append(kw_name)
kwargs.pop(kw_name)
if legacy_args:
legacy_args = ', '.join(legacy_args)
message = ('The HappyBase legacy arguments %s were used. These '
'arguments are unused by gcloud.' % (legacy_args,))
_WARN(message)
if kwargs:
raise TypeError('Received unexpected arguments', kwargs.keys())
if limit is not None and limit < 1:
raise ValueError('limit must be positive')
if row_prefix is not None:
if row_start is not None or row_stop is not None:
raise ValueError('row_prefix cannot be combined with '
'row_start or row_stop')
row_start = row_prefix
row_stop = _string_successor(row_prefix)
filters = []
if isinstance(filter_, six.string_types):
raise TypeError('Specifying filters as a string is not supported '
'by Cloud Bigtable. Use a '
'gcloud.bigtable.row.RowFilter instead.')
elif filter_ is not None:
filters.append(filter_)
if columns is not None:
filters.append(_columns_filter_helper(columns))
# versions == 1 since we only want the latest.
filter_ = _filter_chain_helper(versions=1, timestamp=timestamp,
filters=filters)
return row_start, row_stop, filter_
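# Example (illustrative sketch): a ``row_prefix`` of ``b'row'`` is
# translated into the key range [b'row', b'rox') plus a latest-version
# filter:
#
#     >>> row_start, row_stop, _ = _scan_filter_helper(
#     ...     None, None, b'row', None, None, None, {})
#     >>> (row_start, row_stop)
#     (b'row', b'rox')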
def _columns_filter_helper(columns):
"""Creates a union filter for a list of columns.
:type columns: list
:param columns: Iterable containing column names (as strings). Each column
name can be either
* an entire column family: ``fam`` or ``fam:``
* a single column: ``fam:col``
:rtype: :class:`RowFilter <gcloud.bigtable.row.RowFilter>`
:returns: The union filter created containing all of the matched columns.
:raises: :class:`ValueError <exceptions.ValueError>` if there are no
filters to union.
"""
filters = []
for column_family_id, column_qualifier in _get_column_pairs(columns):
fam_filter = FamilyNameRegexFilter(column_family_id)
if column_qualifier is not None:
qual_filter = ColumnQualifierRegexFilter(column_qualifier)
combined_filter = RowFilterChain(
filters=[fam_filter, qual_filter])
filters.append(combined_filter)
else:
filters.append(fam_filter)
num_filters = len(filters)
if num_filters == 0:
raise ValueError('Must have at least one filter.')
elif num_filters == 1:
return filters[0]
else:
return RowFilterUnion(filters=filters)
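# Example (illustrative sketch): a whole family and a single column mix
# a bare family filter with a chained family + qualifier filter,
# combined in a union:
#
#     >>> union = _columns_filter_helper([b'fam1', b'fam2:col'])
#     >>> isinstance(union, RowFilterUnion)
#     True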
def _row_keys_filter_helper(row_keys):
"""Creates a union filter for a list of rows.
:type row_keys: list
:param row_keys: Iterable containing row keys (as strings).
:rtype: :class:`RowFilter <gcloud.bigtable.row.RowFilter>`
:returns: The union filter created containing all of the row keys.
:raises: :class:`ValueError <exceptions.ValueError>` if there are no
filters to union.
"""
filters = []
for row_key in row_keys:
filters.append(RowKeyRegexFilter(row_key))
num_filters = len(filters)
if num_filters == 0:
raise ValueError('Must have at least one filter.')
elif num_filters == 1:
return filters[0]
else:
return RowFilterUnion(filters=filters)
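# Example (illustrative sketch): one row key yields a bare key-regex
# filter, several keys yield a union:
#
#     >>> single = _row_keys_filter_helper([b'row1'])
#     >>> union = _row_keys_filter_helper([b'row1', b'row2'])
#     >>> isinstance(union, RowFilterUnion)
#     True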

View file

@ -0,0 +1,568 @@
# Copyright 2016 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import unittest2
class _SendMixin(object):
_send_called = False
def send(self):
self._send_called = True
class TestBatch(unittest2.TestCase):
def _getTargetClass(self):
from gcloud.bigtable.happybase.batch import Batch
return Batch
def _makeOne(self, *args, **kwargs):
return self._getTargetClass()(*args, **kwargs)
def test_constructor_defaults(self):
table = object()
batch = self._makeOne(table)
self.assertEqual(batch._table, table)
self.assertEqual(batch._batch_size, None)
self.assertEqual(batch._timestamp, None)
self.assertEqual(batch._delete_range, None)
self.assertEqual(batch._transaction, False)
self.assertEqual(batch._row_map, {})
self.assertEqual(batch._mutation_count, 0)
def test_constructor_explicit(self):
from gcloud._helpers import _datetime_from_microseconds
from gcloud.bigtable.row_filters import TimestampRange
table = object()
timestamp = 144185290431
batch_size = 42
transaction = False # Must be False when batch_size is non-null
batch = self._makeOne(table, timestamp=timestamp,
batch_size=batch_size, transaction=transaction)
self.assertEqual(batch._table, table)
self.assertEqual(batch._batch_size, batch_size)
self.assertEqual(batch._timestamp,
_datetime_from_microseconds(1000 * timestamp))
next_timestamp = _datetime_from_microseconds(1000 * (timestamp + 1))
time_range = TimestampRange(end=next_timestamp)
self.assertEqual(batch._delete_range, time_range)
self.assertEqual(batch._transaction, transaction)
self.assertEqual(batch._row_map, {})
self.assertEqual(batch._mutation_count, 0)
def test_constructor_with_non_default_wal(self):
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase import batch as MUT
warned = []
def mock_warn(msg):
warned.append(msg)
table = object()
wal = object()
with _Monkey(MUT, _WARN=mock_warn):
self._makeOne(table, wal=wal)
self.assertEqual(warned, [MUT._WAL_WARNING])
def test_constructor_with_non_positive_batch_size(self):
table = object()
batch_size = -10
with self.assertRaises(ValueError):
self._makeOne(table, batch_size=batch_size)
batch_size = 0
with self.assertRaises(ValueError):
self._makeOne(table, batch_size=batch_size)
def test_constructor_with_batch_size_and_transactional(self):
table = object()
batch_size = 1
transaction = True
with self.assertRaises(TypeError):
self._makeOne(table, batch_size=batch_size,
transaction=transaction)
def test_send(self):
table = object()
batch = self._makeOne(table)
batch._row_map = row_map = _MockRowMap()
row_map['row-key1'] = row1 = _MockRow()
row_map['row-key2'] = row2 = _MockRow()
batch._mutation_count = 1337
self.assertEqual(row_map.clear_count, 0)
self.assertEqual(row1.commits, 0)
self.assertEqual(row2.commits, 0)
self.assertNotEqual(batch._mutation_count, 0)
self.assertNotEqual(row_map, {})
batch.send()
self.assertEqual(row_map.clear_count, 1)
self.assertEqual(row1.commits, 1)
self.assertEqual(row2.commits, 1)
self.assertEqual(batch._mutation_count, 0)
self.assertEqual(row_map, {})
def test__try_send_no_batch_size(self):
klass = self._getTargetClass()
class BatchWithSend(_SendMixin, klass):
pass
table = object()
batch = BatchWithSend(table)
self.assertEqual(batch._batch_size, None)
self.assertFalse(batch._send_called)
batch._try_send()
self.assertFalse(batch._send_called)
def test__try_send_too_few_mutations(self):
klass = self._getTargetClass()
class BatchWithSend(_SendMixin, klass):
pass
table = object()
batch_size = 10
batch = BatchWithSend(table, batch_size=batch_size)
self.assertEqual(batch._batch_size, batch_size)
self.assertFalse(batch._send_called)
mutation_count = 2
batch._mutation_count = mutation_count
self.assertTrue(mutation_count < batch_size)
batch._try_send()
self.assertFalse(batch._send_called)
def test__try_send_actual_send(self):
klass = self._getTargetClass()
class BatchWithSend(_SendMixin, klass):
pass
table = object()
batch_size = 10
batch = BatchWithSend(table, batch_size=batch_size)
self.assertEqual(batch._batch_size, batch_size)
self.assertFalse(batch._send_called)
mutation_count = 12
batch._mutation_count = mutation_count
self.assertTrue(mutation_count > batch_size)
batch._try_send()
self.assertTrue(batch._send_called)
def test__get_row_exists(self):
table = object()
batch = self._makeOne(table)
row_key = 'row-key'
row_obj = object()
batch._row_map[row_key] = row_obj
result = batch._get_row(row_key)
self.assertEqual(result, row_obj)
def test__get_row_create_new(self):
# Create a mock batch whose table exposes a mock low-level table.
low_level_table = _MockLowLevelTable()
table = _MockTable(low_level_table)
batch = self._makeOne(table)
# Make sure row map is empty.
self.assertEqual(batch._row_map, {})
# Customize/capture mock row creation.
low_level_table.mock_row = mock_row = object()
# Actually get the row (which creates a row via a low-level table).
row_key = 'row-key'
result = batch._get_row(row_key)
self.assertEqual(result, mock_row)
# Check all the things that were constructed.
self.assertEqual(low_level_table.rows_made, [row_key])
# Check how the batch was updated.
self.assertEqual(batch._row_map, {row_key: mock_row})
def test_put_bad_wal(self):
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase import batch as MUT
warned = []
def mock_warn(message):
warned.append(message)
# Raise an exception so we don't have to mock the entire
# environment needed for put().
raise RuntimeError('No need to execute the rest.')
table = object()
batch = self._makeOne(table)
row = 'row-key'
data = {}
wal = None
self.assertNotEqual(wal, MUT._WAL_SENTINEL)
with _Monkey(MUT, _WARN=mock_warn):
with self.assertRaises(RuntimeError):
batch.put(row, data, wal=wal)
self.assertEqual(warned, [MUT._WAL_WARNING])
def test_put(self):
import operator
table = object()
batch = self._makeOne(table)
batch._timestamp = timestamp = object()
row_key = 'row-key'
batch._row_map[row_key] = row = _MockRow()
col1_fam = 'cf1'
col1_qual = 'qual1'
value1 = 'value1'
col2_fam = 'cf2'
col2_qual = 'qual2'
value2 = 'value2'
data = {col1_fam + ':' + col1_qual: value1,
col2_fam + ':' + col2_qual: value2}
self.assertEqual(batch._mutation_count, 0)
self.assertEqual(row.set_cell_calls, [])
batch.put(row_key, data)
self.assertEqual(batch._mutation_count, 2)
# Since the calls depend on data.keys(), the order
# is non-deterministic.
first_elt = operator.itemgetter(0)
ordered_calls = sorted(row.set_cell_calls, key=first_elt)
cell1_args = (col1_fam, col1_qual, value1)
cell1_kwargs = {'timestamp': timestamp}
cell2_args = (col2_fam, col2_qual, value2)
cell2_kwargs = {'timestamp': timestamp}
self.assertEqual(ordered_calls, [
(cell1_args, cell1_kwargs),
(cell2_args, cell2_kwargs),
])
def test_put_call_try_send(self):
klass = self._getTargetClass()
class CallTrySend(klass):
try_send_calls = 0
def _try_send(self):
self.try_send_calls += 1
table = object()
batch = CallTrySend(table)
row_key = 'row-key'
batch._row_map[row_key] = _MockRow()
self.assertEqual(batch._mutation_count, 0)
self.assertEqual(batch.try_send_calls, 0)
# No data so that nothing happens
batch.put(row_key, data={})
self.assertEqual(batch._mutation_count, 0)
self.assertEqual(batch.try_send_calls, 1)
def _delete_columns_test_helper(self, time_range=None):
table = object()
batch = self._makeOne(table)
batch._delete_range = time_range
col1_fam = 'cf1'
col2_fam = 'cf2'
col2_qual = 'col-name'
columns = [col1_fam + ':', col2_fam + ':' + col2_qual]
row_object = _MockRow()
batch._delete_columns(columns, row_object)
self.assertEqual(row_object.commits, 0)
cell_deleted_args = (col2_fam, col2_qual)
cell_deleted_kwargs = {'time_range': time_range}
self.assertEqual(row_object.delete_cell_calls,
[(cell_deleted_args, cell_deleted_kwargs)])
fam_deleted_args = (col1_fam,)
fam_deleted_kwargs = {'columns': row_object.ALL_COLUMNS}
self.assertEqual(row_object.delete_cells_calls,
[(fam_deleted_args, fam_deleted_kwargs)])
def test__delete_columns(self):
self._delete_columns_test_helper()
def test__delete_columns_w_time_and_col_fam(self):
time_range = object()
with self.assertRaises(ValueError):
self._delete_columns_test_helper(time_range=time_range)
def test_delete_bad_wal(self):
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase import batch as MUT
warned = []
def mock_warn(message):
warned.append(message)
# Raise an exception so we don't have to mock the entire
# environment needed for delete().
raise RuntimeError('No need to execute the rest.')
table = object()
batch = self._makeOne(table)
row = 'row-key'
columns = []
wal = None
self.assertNotEqual(wal, MUT._WAL_SENTINEL)
with _Monkey(MUT, _WARN=mock_warn):
with self.assertRaises(RuntimeError):
batch.delete(row, columns=columns, wal=wal)
self.assertEqual(warned, [MUT._WAL_WARNING])
def test_delete_entire_row(self):
table = object()
batch = self._makeOne(table)
row_key = 'row-key'
batch._row_map[row_key] = row = _MockRow()
self.assertEqual(row.deletes, 0)
self.assertEqual(batch._mutation_count, 0)
batch.delete(row_key, columns=None)
self.assertEqual(row.deletes, 1)
self.assertEqual(batch._mutation_count, 1)
def test_delete_entire_row_with_ts(self):
table = object()
batch = self._makeOne(table)
batch._delete_range = object()
row_key = 'row-key'
batch._row_map[row_key] = row = _MockRow()
self.assertEqual(row.deletes, 0)
self.assertEqual(batch._mutation_count, 0)
with self.assertRaises(ValueError):
batch.delete(row_key, columns=None)
self.assertEqual(row.deletes, 0)
self.assertEqual(batch._mutation_count, 0)
def test_delete_call_try_send(self):
klass = self._getTargetClass()
class CallTrySend(klass):
try_send_calls = 0
def _try_send(self):
self.try_send_calls += 1
table = object()
batch = CallTrySend(table)
row_key = 'row-key'
batch._row_map[row_key] = _MockRow()
self.assertEqual(batch._mutation_count, 0)
self.assertEqual(batch.try_send_calls, 0)
# No columns so that nothing happens
batch.delete(row_key, columns=[])
self.assertEqual(batch._mutation_count, 0)
self.assertEqual(batch.try_send_calls, 1)
def test_delete_some_columns(self):
table = object()
batch = self._makeOne(table)
row_key = 'row-key'
batch._row_map[row_key] = row = _MockRow()
self.assertEqual(batch._mutation_count, 0)
col1_fam = 'cf1'
col2_fam = 'cf2'
col2_qual = 'col-name'
columns = [col1_fam + ':', col2_fam + ':' + col2_qual]
batch.delete(row_key, columns=columns)
self.assertEqual(batch._mutation_count, 2)
cell_deleted_args = (col2_fam, col2_qual)
cell_deleted_kwargs = {'time_range': None}
self.assertEqual(row.delete_cell_calls,
[(cell_deleted_args, cell_deleted_kwargs)])
fam_deleted_args = (col1_fam,)
fam_deleted_kwargs = {'columns': row.ALL_COLUMNS}
self.assertEqual(row.delete_cells_calls,
[(fam_deleted_args, fam_deleted_kwargs)])
def test_context_manager(self):
klass = self._getTargetClass()
class BatchWithSend(_SendMixin, klass):
pass
table = object()
batch = BatchWithSend(table)
self.assertFalse(batch._send_called)
with batch:
pass
self.assertTrue(batch._send_called)
def test_context_manager_with_exception_non_transactional(self):
klass = self._getTargetClass()
class BatchWithSend(_SendMixin, klass):
pass
table = object()
batch = BatchWithSend(table)
self.assertFalse(batch._send_called)
with self.assertRaises(ValueError):
with batch:
raise ValueError('Something bad happened')
self.assertTrue(batch._send_called)
def test_context_manager_with_exception_transactional(self):
klass = self._getTargetClass()
class BatchWithSend(_SendMixin, klass):
pass
table = object()
batch = BatchWithSend(table, transaction=True)
self.assertFalse(batch._send_called)
with self.assertRaises(ValueError):
with batch:
raise ValueError('Something bad happened')
self.assertFalse(batch._send_called)
# Just to make sure send() actually works (and to keep coverage happy).
batch.send()
self.assertTrue(batch._send_called)
class Test__get_column_pairs(unittest2.TestCase):
def _callFUT(self, *args, **kwargs):
from gcloud.bigtable.happybase.batch import _get_column_pairs
return _get_column_pairs(*args, **kwargs)
def test_it(self):
columns = [b'cf1', u'cf2:', 'cf3::', 'cf3:name1', 'cf3:name2']
result = self._callFUT(columns)
expected_result = [
['cf1', None],
['cf2', None],
['cf3', ''],
['cf3', 'name1'],
['cf3', 'name2'],
]
self.assertEqual(result, expected_result)
def test_bad_column(self):
columns = ['a:b:c']
with self.assertRaises(ValueError):
self._callFUT(columns)
def test_bad_column_type(self):
columns = [None]
with self.assertRaises(AttributeError):
self._callFUT(columns)
def test_bad_columns_var(self):
columns = None
with self.assertRaises(TypeError):
self._callFUT(columns)
def test_column_family_with_require_qualifier(self):
columns = ['a:']
with self.assertRaises(ValueError):
self._callFUT(columns, require_qualifier=True)
class _MockRowMap(dict):
clear_count = 0
def clear(self):
self.clear_count += 1
super(_MockRowMap, self).clear()
class _MockRow(object):
ALL_COLUMNS = object()
def __init__(self):
self.commits = 0
self.deletes = 0
self.set_cell_calls = []
self.delete_cell_calls = []
self.delete_cells_calls = []
def commit(self):
self.commits += 1
def delete(self):
self.deletes += 1
def set_cell(self, *args, **kwargs):
self.set_cell_calls.append((args, kwargs))
def delete_cell(self, *args, **kwargs):
self.delete_cell_calls.append((args, kwargs))
def delete_cells(self, *args, **kwargs):
self.delete_cells_calls.append((args, kwargs))
class _MockTable(object):
def __init__(self, low_level_table):
self._low_level_table = low_level_table
class _MockLowLevelTable(object):
def __init__(self, *args, **kwargs):
self.args = args
self.kwargs = kwargs
self.rows_made = []
self.mock_row = None
def row(self, row_key):
self.rows_made.append(row_key)
return self.mock_row

View file

@ -0,0 +1,682 @@
# Copyright 2016 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import unittest2
class Test__get_instance(unittest2.TestCase):
def _callFUT(self, timeout=None):
from gcloud.bigtable.happybase.connection import _get_instance
return _get_instance(timeout=timeout)
def _helper(self, timeout=None, instances=(), failed_locations=()):
from functools import partial
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase import connection as MUT
client_with_instances = partial(
_Client, instances=instances, failed_locations=failed_locations)
with _Monkey(MUT, Client=client_with_instances):
result = self._callFUT(timeout=timeout)
# If we've reached this point, then _callFUT didn't fail, so we know
# there is exactly one instance.
instance, = instances
self.assertEqual(result, instance)
client = instance.client
self.assertEqual(client.args, ())
expected_kwargs = {'admin': True}
if timeout is not None:
expected_kwargs['timeout_seconds'] = timeout / 1000.0
self.assertEqual(client.kwargs, expected_kwargs)
self.assertEqual(client.start_calls, 1)
self.assertEqual(client.stop_calls, 1)
def test_default(self):
instance = _Instance()
self._helper(instances=[instance])
def test_with_timeout(self):
instance = _Instance()
self._helper(timeout=2103, instances=[instance])
def test_with_no_instances(self):
with self.assertRaises(ValueError):
self._helper()
def test_with_too_many_instances(self):
instances = [_Instance(), _Instance()]
with self.assertRaises(ValueError):
self._helper(instances=instances)
def test_with_failed_locations(self):
instance = _Instance()
failed_location = 'us-central1-c'
with self.assertRaises(ValueError):
self._helper(instances=[instance],
failed_locations=[failed_location])
class TestConnection(unittest2.TestCase):
def _getTargetClass(self):
from gcloud.bigtable.happybase.connection import Connection
return Connection
def _makeOne(self, *args, **kwargs):
return self._getTargetClass()(*args, **kwargs)
def test_constructor_defaults(self):
instance = _Instance() # Avoid implicit environ check.
self.assertEqual(instance._client.start_calls, 0)
connection = self._makeOne(instance=instance)
self.assertEqual(instance._client.start_calls, 1)
self.assertEqual(instance._client.stop_calls, 0)
self.assertEqual(connection._instance, instance)
self.assertEqual(connection.table_prefix, None)
self.assertEqual(connection.table_prefix_separator, '_')
def test_constructor_no_autoconnect(self):
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
self.assertEqual(instance._client.start_calls, 0)
self.assertEqual(instance._client.stop_calls, 0)
self.assertEqual(connection.table_prefix, None)
self.assertEqual(connection.table_prefix_separator, '_')
def test_constructor_missing_instance(self):
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase import connection as MUT
instance = _Instance()
timeout = object()
get_instance_called = []
def mock_get_instance(timeout):
get_instance_called.append(timeout)
return instance
with _Monkey(MUT, _get_instance=mock_get_instance):
connection = self._makeOne(autoconnect=False, instance=None,
timeout=timeout)
self.assertEqual(connection.table_prefix, None)
self.assertEqual(connection.table_prefix_separator, '_')
self.assertEqual(connection._instance, instance)
self.assertEqual(get_instance_called, [timeout])
def test_constructor_explicit(self):
autoconnect = False
table_prefix = 'table-prefix'
table_prefix_separator = 'sep'
instance_copy = _Instance()
instance = _Instance(copies=[instance_copy])
connection = self._makeOne(
autoconnect=autoconnect,
table_prefix=table_prefix,
table_prefix_separator=table_prefix_separator,
instance=instance)
self.assertEqual(connection.table_prefix, table_prefix)
self.assertEqual(connection.table_prefix_separator,
table_prefix_separator)
def test_constructor_with_unknown_argument(self):
instance = _Instance()
with self.assertRaises(TypeError):
self._makeOne(instance=instance, unknown='foo')
def test_constructor_with_legacy_args(self):
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase import connection as MUT
warned = []
def mock_warn(msg):
warned.append(msg)
instance = _Instance()
with _Monkey(MUT, _WARN=mock_warn):
self._makeOne(instance=instance, host=object(),
port=object(), compat=object(),
transport=object(), protocol=object())
self.assertEqual(len(warned), 1)
self.assertIn('host', warned[0])
self.assertIn('port', warned[0])
self.assertIn('compat', warned[0])
self.assertIn('transport', warned[0])
self.assertIn('protocol', warned[0])
def test_constructor_with_timeout_and_instance(self):
instance = _Instance()
with self.assertRaises(ValueError):
self._makeOne(instance=instance, timeout=object())
def test_constructor_non_string_prefix(self):
table_prefix = object()
with self.assertRaises(TypeError):
self._makeOne(autoconnect=False,
table_prefix=table_prefix)
def test_constructor_non_string_prefix_separator(self):
table_prefix_separator = object()
with self.assertRaises(TypeError):
self._makeOne(autoconnect=False,
table_prefix_separator=table_prefix_separator)
def test_open(self):
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
self.assertEqual(instance._client.start_calls, 0)
connection.open()
self.assertEqual(instance._client.start_calls, 1)
self.assertEqual(instance._client.stop_calls, 0)
def test_close(self):
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
self.assertEqual(instance._client.stop_calls, 0)
connection.close()
self.assertEqual(instance._client.stop_calls, 1)
self.assertEqual(instance._client.start_calls, 0)
def test___del__with_instance(self):
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
self.assertEqual(instance._client.stop_calls, 0)
connection.__del__()
self.assertEqual(instance._client.stop_calls, 1)
def test___del__no_instance(self):
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
self.assertEqual(instance._client.stop_calls, 0)
del connection._instance
connection.__del__()
self.assertEqual(instance._client.stop_calls, 0)
def test__table_name_with_prefix_set(self):
table_prefix = 'table-prefix'
table_prefix_separator = '<>'
instance = _Instance()
connection = self._makeOne(
autoconnect=False,
table_prefix=table_prefix,
table_prefix_separator=table_prefix_separator,
instance=instance)
name = 'some-name'
prefixed = connection._table_name(name)
self.assertEqual(prefixed,
table_prefix + table_prefix_separator + name)
def test__table_name_with_no_prefix_set(self):
instance = _Instance()
connection = self._makeOne(autoconnect=False,
instance=instance)
name = 'some-name'
prefixed = connection._table_name(name)
self.assertEqual(prefixed, name)
def test_table_factory(self):
from gcloud.bigtable.happybase.table import Table
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
name = 'table-name'
table = connection.table(name)
self.assertTrue(isinstance(table, Table))
self.assertEqual(table.name, name)
self.assertEqual(table.connection, connection)
def _table_factory_prefix_helper(self, use_prefix=True):
from gcloud.bigtable.happybase.table import Table
instance = _Instance() # Avoid implicit environ check.
table_prefix = 'table-prefix'
table_prefix_separator = '<>'
connection = self._makeOne(
autoconnect=False, table_prefix=table_prefix,
table_prefix_separator=table_prefix_separator,
instance=instance)
name = 'table-name'
table = connection.table(name, use_prefix=use_prefix)
self.assertTrue(isinstance(table, Table))
prefixed_name = table_prefix + table_prefix_separator + name
if use_prefix:
self.assertEqual(table.name, prefixed_name)
else:
self.assertEqual(table.name, name)
self.assertEqual(table.connection, connection)
def test_table_factory_with_prefix(self):
self._table_factory_prefix_helper(use_prefix=True)
def test_table_factory_with_ignored_prefix(self):
self._table_factory_prefix_helper(use_prefix=False)
def test_tables(self):
from gcloud.bigtable.table import Table
table_name1 = 'table-name1'
table_name2 = 'table-name2'
instance = _Instance(list_tables_result=[
Table(table_name1, None),
Table(table_name2, None),
])
connection = self._makeOne(autoconnect=False, instance=instance)
result = connection.tables()
self.assertEqual(result, [table_name1, table_name2])
def test_tables_with_prefix(self):
from gcloud.bigtable.table import Table
table_prefix = 'prefix'
table_prefix_separator = '<>'
unprefixed_table_name1 = 'table-name1'
table_name1 = (table_prefix + table_prefix_separator +
unprefixed_table_name1)
table_name2 = 'table-name2'
instance = _Instance(list_tables_result=[
Table(table_name1, None),
Table(table_name2, None),
])
connection = self._makeOne(
autoconnect=False, instance=instance, table_prefix=table_prefix,
table_prefix_separator=table_prefix_separator)
result = connection.tables()
self.assertEqual(result, [unprefixed_table_name1])
def test_create_table(self):
import operator
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase import connection as MUT
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
mock_gc_rule = object()
called_options = []
def mock_parse_family_option(option):
called_options.append(option)
return mock_gc_rule
name = 'table-name'
col_fam1 = 'cf1'
col_fam_option1 = object()
col_fam2 = u'cf2'
col_fam_option2 = object()
col_fam3 = b'cf3'
col_fam_option3 = object()
families = {
col_fam1: col_fam_option1,
# A trailing colon is also allowed.
col_fam2 + ':': col_fam_option2,
col_fam3 + b':': col_fam_option3,
}
tables_created = []
def make_table(*args, **kwargs):
result = _MockLowLevelTable(*args, **kwargs)
tables_created.append(result)
return result
with _Monkey(MUT, _LowLevelTable=make_table,
_parse_family_option=mock_parse_family_option):
connection.create_table(name, families)
# Exactly one table should have been created.
table_instance, = tables_created
self.assertEqual(table_instance.args, (name, instance))
self.assertEqual(table_instance.kwargs, {})
self.assertEqual(table_instance.create_calls, 1)
# Check that our mock was called once per option, though we don't
# know the order.
self.assertEqual(
set(called_options),
set([col_fam_option1, col_fam_option2, col_fam_option3]))
# We expect three column family instances created, but don't know the
# order due to non-deterministic dict.items().
col_fam_created = table_instance.col_fam_created
self.assertEqual(len(col_fam_created), 3)
col_fam_created.sort(key=operator.attrgetter('column_family_id'))
self.assertEqual(col_fam_created[0].column_family_id, col_fam1)
self.assertEqual(col_fam_created[0].gc_rule, mock_gc_rule)
self.assertEqual(col_fam_created[0].create_calls, 1)
self.assertEqual(col_fam_created[1].column_family_id, col_fam2)
self.assertEqual(col_fam_created[1].gc_rule, mock_gc_rule)
self.assertEqual(col_fam_created[1].create_calls, 1)
self.assertEqual(col_fam_created[2].column_family_id,
col_fam3.decode('utf-8'))
self.assertEqual(col_fam_created[2].gc_rule, mock_gc_rule)
self.assertEqual(col_fam_created[2].create_calls, 1)
def test_create_table_bad_type(self):
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
name = 'table-name'
families = None
with self.assertRaises(TypeError):
connection.create_table(name, families)
def test_create_table_bad_value(self):
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
name = 'table-name'
families = {}
with self.assertRaises(ValueError):
connection.create_table(name, families)
def _create_table_error_helper(self, err_val, err_type):
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase import connection as MUT
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
tables_created = []
def make_table(*args, **kwargs):
kwargs['create_error'] = err_val
result = _MockLowLevelTable(*args, **kwargs)
tables_created.append(result)
return result
name = 'table-name'
families = {'foo': {}}
with _Monkey(MUT, _LowLevelTable=make_table):
with self.assertRaises(err_type):
connection.create_table(name, families)
self.assertEqual(len(tables_created), 1)
self.assertEqual(tables_created[0].create_calls, 1)
@unittest2.skipUnless(sys.version_info[:2] == (2, 7),
'gRPC only in Python 2.7')
def test_create_table_already_exists(self):
from grpc.beta import interfaces
from grpc.framework.interfaces.face import face
from gcloud.bigtable.happybase.connection import AlreadyExists
err_val = face.NetworkError(None, None,
interfaces.StatusCode.ALREADY_EXISTS, None)
self._create_table_error_helper(err_val, AlreadyExists)
@unittest2.skipUnless(sys.version_info[:2] == (2, 7),
'gRPC only in Python 2.7')
def test_create_table_connection_error(self):
from grpc.beta import interfaces
from grpc.framework.interfaces.face import face
err_val = face.NetworkError(None, None,
interfaces.StatusCode.INTERNAL, None)
self._create_table_error_helper(err_val, face.NetworkError)
@unittest2.skipUnless(sys.version_info[:2] == (2, 7),
'gRPC only in Python 2.7')
def test_create_table_other_error(self):
self._create_table_error_helper(RuntimeError, RuntimeError)
def _delete_table_helper(self, disable=False):
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase import connection as MUT
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
tables_created = []
def make_table(*args, **kwargs):
result = _MockLowLevelTable(*args, **kwargs)
tables_created.append(result)
return result
name = 'table-name'
with _Monkey(MUT, _LowLevelTable=make_table):
connection.delete_table(name, disable=disable)
# Exactly one table should have been created.
table_instance, = tables_created
self.assertEqual(table_instance.args, (name, instance))
self.assertEqual(table_instance.kwargs, {})
self.assertEqual(table_instance.delete_calls, 1)
def test_delete_table(self):
self._delete_table_helper()
def test_delete_table_disable(self):
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase import connection as MUT
warned = []
def mock_warn(msg):
warned.append(msg)
with _Monkey(MUT, _WARN=mock_warn):
self._delete_table_helper(disable=True)
self.assertEqual(warned, [MUT._DISABLE_DELETE_MSG])
def test_enable_table(self):
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
name = 'table-name'
with self.assertRaises(NotImplementedError):
connection.enable_table(name)
def test_disable_table(self):
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
name = 'table-name'
with self.assertRaises(NotImplementedError):
connection.disable_table(name)
def test_is_table_enabled(self):
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
name = 'table-name'
with self.assertRaises(NotImplementedError):
connection.is_table_enabled(name)
def test_compact_table(self):
instance = _Instance() # Avoid implicit environ check.
connection = self._makeOne(autoconnect=False, instance=instance)
name = 'table-name'
major = True
with self.assertRaises(NotImplementedError):
connection.compact_table(name, major=major)
class Test__parse_family_option(unittest2.TestCase):
def _callFUT(self, option):
from gcloud.bigtable.happybase.connection import _parse_family_option
return _parse_family_option(option)
def test_dictionary_no_keys(self):
option = {}
result = self._callFUT(option)
self.assertEqual(result, None)
def test_null(self):
option = None
result = self._callFUT(option)
self.assertEqual(result, None)
def test_dictionary_bad_key(self):
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase import connection as MUT
warned = []
def mock_warn(msg):
warned.append(msg)
option = {'badkey': None}
with _Monkey(MUT, _WARN=mock_warn):
result = self._callFUT(option)
self.assertEqual(result, None)
self.assertEqual(len(warned), 1)
self.assertIn('badkey', warned[0])
def test_dictionary_versions_key(self):
from gcloud.bigtable.column_family import MaxVersionsGCRule
versions = 42
option = {'max_versions': versions}
result = self._callFUT(option)
gc_rule = MaxVersionsGCRule(versions)
self.assertEqual(result, gc_rule)
def test_dictionary_ttl_key(self):
import datetime
from gcloud.bigtable.column_family import MaxAgeGCRule
time_to_live = 24 * 60 * 60
max_age = datetime.timedelta(days=1)
option = {'time_to_live': time_to_live}
result = self._callFUT(option)
gc_rule = MaxAgeGCRule(max_age)
self.assertEqual(result, gc_rule)
def test_dictionary_both_keys(self):
import datetime
from gcloud.bigtable.column_family import GCRuleIntersection
from gcloud.bigtable.column_family import MaxAgeGCRule
from gcloud.bigtable.column_family import MaxVersionsGCRule
versions = 42
time_to_live = 24 * 60 * 60
option = {
'max_versions': versions,
'time_to_live': time_to_live,
}
result = self._callFUT(option)
max_age = datetime.timedelta(days=1)
# NOTE: This relies on the method under test building its rules in
# the same order as they are built here.
gc_rule1 = MaxAgeGCRule(max_age)
gc_rule2 = MaxVersionsGCRule(versions)
gc_rule = GCRuleIntersection(rules=[gc_rule1, gc_rule2])
self.assertEqual(result, gc_rule)
def test_non_dictionary(self):
option = object()
self.assertFalse(isinstance(option, dict))
result = self._callFUT(option)
self.assertEqual(result, option)
class _Client(object):
def __init__(self, *args, **kwargs):
self.instances = kwargs.pop('instances', [])
for instance in self.instances:
instance.client = self
self.failed_locations = kwargs.pop('failed_locations', [])
self.args = args
self.kwargs = kwargs
self.start_calls = 0
self.stop_calls = 0
def start(self):
self.start_calls += 1
def stop(self):
self.stop_calls += 1
def list_instances(self):
return self.instances, self.failed_locations
class _Instance(object):
def __init__(self, copies=(), list_tables_result=()):
self.copies = list(copies)
# Included to support Connection.__del__
self._client = _Client()
self.list_tables_result = list_tables_result
def copy(self):
if self.copies:
result = self.copies[0]
self.copies[:] = self.copies[1:]
return result
else:
return self
def list_tables(self):
return self.list_tables_result
class _MockLowLevelColumnFamily(object):
def __init__(self, column_family_id, gc_rule=None):
self.column_family_id = column_family_id
self.gc_rule = gc_rule
self.create_calls = 0
def create(self):
self.create_calls += 1
class _MockLowLevelTable(object):
def __init__(self, *args, **kwargs):
self.args = args
self.kwargs = kwargs
self.create_error = kwargs.get('create_error')
self.delete_calls = 0
self.create_calls = 0
self.col_fam_created = []
def delete(self):
self.delete_calls += 1
def create(self):
self.create_calls += 1
if self.create_error:
raise self.create_error
def column_family(self, column_family_id, gc_rule=None):
result = _MockLowLevelColumnFamily(column_family_id, gc_rule=gc_rule)
self.col_fam_created.append(result)
return result

View file

@ -0,0 +1,264 @@
# Copyright 2016 Google Inc. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import unittest2
class TestConnectionPool(unittest2.TestCase):
def _getTargetClass(self):
from gcloud.bigtable.happybase.pool import ConnectionPool
return ConnectionPool
def _makeOne(self, *args, **kwargs):
return self._getTargetClass()(*args, **kwargs)
def test_constructor_defaults(self):
import six
import threading
from gcloud.bigtable.happybase.connection import Connection
size = 11
instance_copy = _Instance()
all_copies = [instance_copy] * size
instance = _Instance(all_copies) # Avoid implicit environ check.
pool = self._makeOne(size, instance=instance)
self.assertTrue(isinstance(pool._lock, type(threading.Lock())))
self.assertTrue(isinstance(pool._thread_connections, threading.local))
self.assertEqual(pool._thread_connections.__dict__, {})
queue = pool._queue
self.assertTrue(isinstance(queue, six.moves.queue.LifoQueue))
self.assertTrue(queue.full())
self.assertEqual(queue.maxsize, size)
for connection in queue.queue:
self.assertTrue(isinstance(connection, Connection))
self.assertTrue(connection._instance is instance_copy)
def test_constructor_passes_kwargs(self):
table_prefix = 'foo'
table_prefix_separator = '<>'
instance = _Instance() # Avoid implicit environ check.
size = 1
pool = self._makeOne(size, table_prefix=table_prefix,
table_prefix_separator=table_prefix_separator,
instance=instance)
for connection in pool._queue.queue:
self.assertEqual(connection.table_prefix, table_prefix)
self.assertEqual(connection.table_prefix_separator,
table_prefix_separator)
def test_constructor_ignores_autoconnect(self):
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase.connection import Connection
from gcloud.bigtable.happybase import pool as MUT
class ConnectionWithOpen(Connection):
_open_called = False
def open(self):
self._open_called = True
# First make sure the custom Connection class does as expected.
instance_copy1 = _Instance()
instance_copy2 = _Instance()
instance_copy3 = _Instance()
instance = _Instance([instance_copy1, instance_copy2, instance_copy3])
connection = ConnectionWithOpen(autoconnect=False, instance=instance)
self.assertFalse(connection._open_called)
self.assertTrue(connection._instance is instance_copy1)
connection = ConnectionWithOpen(autoconnect=True, instance=instance)
self.assertTrue(connection._open_called)
self.assertTrue(connection._instance is instance_copy2)
# Then make sure autoconnect=True is ignored in a pool.
size = 1
with _Monkey(MUT, Connection=ConnectionWithOpen):
pool = self._makeOne(size, autoconnect=True, instance=instance)
for connection in pool._queue.queue:
self.assertTrue(isinstance(connection, ConnectionWithOpen))
self.assertTrue(connection._instance is instance_copy3)
self.assertFalse(connection._open_called)
def test_constructor_infers_instance(self):
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase.connection import Connection
from gcloud.bigtable.happybase import pool as MUT
size = 1
instance_copy = _Instance()
all_copies = [instance_copy] * size
instance = _Instance(all_copies)
get_instance_calls = []
def mock_get_instance(timeout=None):
get_instance_calls.append(timeout)
return instance
with _Monkey(MUT, _get_instance=mock_get_instance):
pool = self._makeOne(size)
for connection in pool._queue.queue:
self.assertTrue(isinstance(connection, Connection))
# We know that the Connection() constructor will
# call instance.copy().
self.assertTrue(connection._instance is instance_copy)
self.assertEqual(get_instance_calls, [None])
def test_constructor_non_integer_size(self):
size = None
with self.assertRaises(TypeError):
self._makeOne(size)
def test_constructor_non_positive_size(self):
size = -10
with self.assertRaises(ValueError):
self._makeOne(size)
size = 0
with self.assertRaises(ValueError):
self._makeOne(size)
def _makeOneWithMockQueue(self, queue_return):
from gcloud._testing import _Monkey
from gcloud.bigtable.happybase import pool as MUT
# We are going to use a fake queue, so we don't want any connections
# or instances to be created in the constructor.
size = -1
instance = object()
with _Monkey(MUT, _MIN_POOL_SIZE=size):
pool = self._makeOne(size, instance=instance)
pool._queue = _Queue(queue_return)
return pool
def test__acquire_connection(self):
queue_return = object()
pool = self._makeOneWithMockQueue(queue_return)
timeout = 432
connection = pool._acquire_connection(timeout=timeout)
self.assertTrue(connection is queue_return)
self.assertEqual(pool._queue._get_calls, [(True, timeout)])
self.assertEqual(pool._queue._put_calls, [])
def test__acquire_connection_failure(self):
from gcloud.bigtable.happybase.pool import NoConnectionsAvailable
pool = self._makeOneWithMockQueue(None)
timeout = 1027
with self.assertRaises(NoConnectionsAvailable):
pool._acquire_connection(timeout=timeout)
self.assertEqual(pool._queue._get_calls, [(True, timeout)])
self.assertEqual(pool._queue._put_calls, [])
def test_connection_is_context_manager(self):
import contextlib
import six
queue_return = _Connection()
pool = self._makeOneWithMockQueue(queue_return)
cnxn_context = pool.connection()
if six.PY3: # pragma: NO COVER Python 3
self.assertTrue(isinstance(cnxn_context,
contextlib._GeneratorContextManager))
else:
self.assertTrue(isinstance(cnxn_context,
contextlib.GeneratorContextManager))
def test_connection_no_current_cnxn(self):
queue_return = _Connection()
pool = self._makeOneWithMockQueue(queue_return)
timeout = 55
self.assertFalse(hasattr(pool._thread_connections, 'current'))
with pool.connection(timeout=timeout) as connection:
self.assertEqual(pool._thread_connections.current, queue_return)
self.assertTrue(connection is queue_return)
self.assertFalse(hasattr(pool._thread_connections, 'current'))
self.assertEqual(pool._queue._get_calls, [(True, timeout)])
self.assertEqual(pool._queue._put_calls,
[(queue_return, None, None)])
def test_connection_with_current_cnxn(self):
current_cnxn = _Connection()
queue_return = _Connection()
pool = self._makeOneWithMockQueue(queue_return)
pool._thread_connections.current = current_cnxn
timeout = 8001
with pool.connection(timeout=timeout) as connection:
self.assertTrue(connection is current_cnxn)
self.assertEqual(pool._queue._get_calls, [])
self.assertEqual(pool._queue._put_calls, [])
self.assertEqual(pool._thread_connections.current, current_cnxn)
class _Client(object):
def __init__(self):
self.stop_calls = 0
def stop(self):
self.stop_calls += 1
class _Connection(object):
def open(self):
pass
class _Instance(object):
def __init__(self, copies=()):
self.copies = list(copies)
# Included to support Connection.__del__
self._client = _Client()
def copy(self):
if self.copies:
result = self.copies[0]
self.copies[:] = self.copies[1:]
return result
else:
return self
class _Queue(object):
def __init__(self, result=None):
self.result = result
self._get_calls = []
self._put_calls = []
def get(self, block=None, timeout=None):
self._get_calls.append((block, timeout))
if self.result is None:
import six
raise six.moves.queue.Empty
else:
return self.result
def put(self, item, block=None, timeout=None):
self._put_calls.append((item, block, timeout))

File diff suppressed because it is too large