RestPose modules

Client

The RestPose client mirrors the resources provided by the RestPose server as Python objects.

class restpose.client.Server(uri='http://127.0.0.1:7777', resource_class=None, resource_instance=None, **client_opts)[source]

Representation of a RestPose server.

Allows indexing, searching, status management, etc.

Parameters:
  • uri – Full URI to the top path of the server.
  • resource_class – If specified, defines a resource class to use instead of the default class. This should usually be a subclass of RestPoseResource.
  • resource_instance – If specified, defines a resource instance to use instead of making one with the default class (or the class specified by resource_class).
  • client_opts – Parameters to use to update the existing client_opts in the resource (if resource_instance is specified), or to use when creating the resource (if resource_class is specified).
status[source]

Get server status.

Returns a dictionary holding the status as returned from the server. See the server documentation for details.

collections[source]

Get a list of existing collections.

Returns a list of collection names (as strings).

collection(coll_name)[source]

Access to a collection.

Parameters:coll_name – The name of the collection to access.
Returns:a Collection object which can be used to search and modify the contents of the Collection.

Note

No request is performed directly by this method; a Collection object is simply created which will make requests when needed. For this reason, no error will be reported at this stage even if the collection does not exist, or if a collection name containing invalid characters is used.
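
The deferred behaviour described in this note can be pictured with a small stand-in. The sketch below does not use RestPose itself: `FakeServer`, `LazyCollection`, `fetch_status` and the `requests_made` counter are hypothetical names invented for illustration, and the real client may be structured differently.

```python
class FakeServer:
    """Hypothetical stand-in for a server; it only counts requests."""

    def __init__(self, existing):
        self.existing = set(existing)
        self.requests_made = 0

    def collection(self, coll_name):
        # No request here: just build a handle that remembers the name.
        return LazyCollection(self, coll_name)

    def fetch_status(self, coll_name):
        # Stand-in for an HTTP request; this is where errors surface.
        self.requests_made += 1
        if coll_name not in self.existing:
            raise LookupError("no such collection: %r" % coll_name)
        return {"name": coll_name}


class LazyCollection:
    """Hypothetical handle which defers all work until it is needed."""

    def __init__(self, server, coll_name):
        self.server = server
        self.coll_name = coll_name

    @property
    def status(self):
        return self.server.fetch_status(self.coll_name)


server = FakeServer(existing=["cheeses"])
missing = server.collection("no_such_collection")  # no error at this stage
```

In this model, the error for the missing collection only appears when `missing.status` is read, which is the first point at which a request is made.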

class restpose.client.FieldQueryFactory(target=None)[source]

Object for creating searches on a field.

Parameters:target – The target to pass to the Query objects created.
target

The target that will be used when creating Query objects. Defaults to None.

class restpose.client.FieldQuerySource(fieldname, target=None)[source]

An object which generates queries for a specific field.

Parameters:
  • fieldname – The name of the field to generate queries for. If set to None, will generate queries across all fields.
  • target – The target to generate queries pointing to.
is_in(values)[source]

Create a query for fields which exactly match the given values.

A document will match if at least one of the stored values for the field exactly matches at least one of the given values.

This query type is currently available only for “exact”, “id” and “cat” field types.

Parameters:

values – A container holding the values to search for. As a special case, if a string is supplied, this is equivalent to supplying a container holding that string.

Example :

Search for documents in which the “tag” field has a value of “edam”, “cheddar” or “leicester”.

>>> query = coll.field.tag.is_in(['edam', 'cheddar', 'leicester'])

Search for documents in which the “tag” field has a value of “edam”.

>>> query = coll.field.tag.is_in('edam')
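
The matching rule for is_in can be restated as a plain Python predicate. This is only a local model of the rule described above, not how the server evaluates the query; `matches_is_in` is a hypothetical helper name.

```python
def matches_is_in(stored_values, values):
    """Return True if any stored value exactly equals any given value."""
    if isinstance(values, str):
        # Special case: a bare string behaves like a one-item container.
        values = [values]
    return not set(stored_values).isdisjoint(values)
```

Note that the comparison is exact: a stored value of 'edam cheese' would not match a search for 'edam'.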
is_descendant(categories)[source]

Create a query for field values which are categories which are descendants of one of the given categories.

A document will match if at least one of the stored values for the field exactly matches a descendant of the given categories.

This query type is available only for “cat” field types.

Parameters:

categories – A container holding the categories to search for. As a special case, if a string is supplied, this is equivalent to supplying a container holding that string.

Example :

Search for documents in which the “tag” field has a value which is a descendant of “cheese”.

>>> query = coll.field.tag.is_descendant('cheese')

or, equivalently:

>>> query = coll.field.tag.is_descendant(['cheese'])
is_or_is_descendant(categories)[source]

Create a query for field values which are categories which are, or are descendants of, one of the given categories.

A document will match if at least one of the stored values for the field exactly matches one of the given categories, or a descendant of one of the given categories.

This query type is available only for “cat” field types.

Parameters:

categories – A container holding the categories to search for. As a special case, if a string is supplied, this is equivalent to supplying a container holding that string.

Example :

Search for documents in which the “tag” field has a value of “cheese”, or has a value which is a descendant of “cheese”.

>>> query = coll.field.tag.is_or_is_descendant('cheese')

or, equivalently:

>>> query = coll.field.tag.is_or_is_descendant(['cheese'])
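
The difference between is_descendant and is_or_is_descendant can be modelled locally. The sketch below assumes a taxonomy given as a dict mapping each category to a list of its direct parents; the helper names (`ancestors`, `matches_is_descendant`, `matches_is_or_is_descendant`) and the sample taxonomy are hypothetical and are not part of the RestPose API.

```python
def ancestors(parents, category):
    """Return the set of all ancestors of a category."""
    seen = set()
    stack = list(parents.get(category, []))
    while stack:
        cat = stack.pop()
        if cat not in seen:
            seen.add(cat)
            stack.extend(parents.get(cat, []))
    return seen


def _as_set(categories):
    # A bare string is treated as a container holding that string.
    if isinstance(categories, str):
        return {categories}
    return set(categories)


def matches_is_descendant(parents, stored_values, categories):
    wanted = _as_set(categories)
    return any(ancestors(parents, v) & wanted for v in stored_values)


def matches_is_or_is_descendant(parents, stored_values, categories):
    wanted = _as_set(categories)
    return any((v in wanted) or (ancestors(parents, v) & wanted)
               for v in stored_values)


# Hypothetical taxonomy: stilton -> blue -> cheese, and an unrelated root.
parents = {"cheese": [], "blue": ["cheese"], "stilton": ["blue"], "bread": []}
```

With this taxonomy, a document tagged "cheese" itself is matched only by the second predicate, while a document tagged "stilton" is matched by both.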
equals(value)

Create a query for fields which exactly match the given value.

Matches documents in which the supplied value exactly matches the stored value.

This query type is currently available only for “exact”, “id” and “cat” field types.

This query type may be constructed using the == operator, or the equals method.

Parameters:

value – The value to search for.

Example :

Search for documents in which the “tag” field has a value of “edam”.

>>> query = coll.field.tag.equals('edam')

Or, equivalently (but less conveniently for chained calls):

>>> query = (coll.field.tag == 'edam')
range(begin, end)[source]

Create a query for field values in a given range.

Matches documents in which at least one of the stored values in the field is in the specified range, including both the begin and end values.

This type is currently available only for “double”, “date” and “timestamp” field types.

Parameters:
  • begin – The start of the range.
  • end – The end of the range.
Example :

Search for documents in which the “num” field has a value in the range 0 to 10 (including the endpoints).

>>> query = coll.field.num.range(0, 10)
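
As a local illustration of the inclusive endpoints, the rule can be written as a one-line predicate. `matches_range` is a hypothetical helper, not a RestPose function.

```python
def matches_range(stored_values, begin, end):
    """Return True if any stored value lies in [begin, end], inclusive."""
    return any(begin <= value <= end for value in stored_values)
```

A document with several stored values matches as soon as one of them falls in the range, even if the others do not.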
text(text, op='phrase', window=None)[source]

Create a query for a piece of text in the field.

This is a simple search for a matching sequence of words (subject to whatever processing has been performed on the field to conflate variant forms of words, such as stemming or word splitting for CJK text).

Parameters:
  • text – The text to search for. If empty, this query will match no results.
  • op – The operator to use when searching. One of “or”, “and”, “phrase” (ordered proximity), “near” (unordered proximity). Default=”phrase”.
  • window – Only relevant if op is “phrase” or “near”. Window size in words within which the words in the text need to occur for a document to match; None=length of text. Integer or None. Default=None
Example :

Search for documents in which the “text” field contains text matching the phrase “Hello world”.

>>> query = coll.field.text.text("Hello world")
parse(text, op='and')[source]

Parse a structured query, searching the field.

Unlike text, this allows various operators to be used in the query; for example, parentheses may be used, and operators such as “AND” may be used.

Todo

Document the operators permitted.

Beware that the parser is unable to make sense of some query strings (e.g. those with mismatched parentheses). If such a query string is used, an error will be returned by the server when the search is performed.

Parameters:
  • text – Text to search for. If empty, this query will match no results.
  • op – The default operator to use when searching. One of “or”, “and”. Default=”and”.
Example :

Search for documents in which the “text” field contains both “Hello” and “world”, but not “big”.

>>> query = coll.field.text.parse("Hello world -big")
exists()[source]

Search for documents in which the field exists.

This type may be used to search across all fields.

Example :

Search for documents in which the “text” field exists.

>>> query = coll.field.text.exists()

Search for documents in which any field exists.

>>> query = coll.any_field.exists()
nonempty()[source]

Search for documents in which the field has a non-empty value.

This type may be used to search across all fields.

Example :

Search for documents in which the “text” field has a non-empty value.

>>> query = coll.field.text.nonempty()

Search for documents in which any field has a non-empty value.

>>> query = coll.any_field.nonempty()
empty()[source]

Search for documents in which the field has an empty value.

This type may be used to search across all fields.

Example :

Search for documents in which the “text” field has an empty value.

>>> query = coll.field.text.empty()

Search for documents in which any field has an empty value.

>>> query = coll.any_field.empty()
has_error()[source]

Search for documents in which the field produced errors when parsing.

This type may be used to search across all fields.

Example :

Search for documents in which the “text” field had an error when parsing.

>>> query = coll.field.text.has_error()

Search for documents in which any field had an error when parsing.

>>> query = coll.any_field.has_error()
class restpose.client.QueryTarget[source]

An object which can be used to make and run queries.

field

Factory for field-specific queries.

any_field

Pseudo field for making queries across all fields.

all()[source]

Create a query which matches all documents.

none()[source]

Create a query which matches no documents.

find(q)[source]

Apply a Query to this QueryTarget.

Parameters:q – A Query object which will have the target applied to it.
search(search)[source]

Perform a search.

Parameters:search – A search structure to be sent to the server, or a Search or Query object.
class restpose.client.Document(collection, doc_type, doc_id)[source]
data[source]
terms[source]
values[source]
class restpose.client.DocumentType(collection, doc_type)[source]
add_doc(doc, doc_id=None)[source]

Add a document to the collection.

delete_doc(doc_id)[source]

Delete a document with this type from the collection.

get_doc(doc_id)[source]
class restpose.client.Collection(server, coll_name)[source]
doc_type(doc_type)[source]
status[source]

The status of the collection.

config[source]

The configuration of the collection.

add_doc(doc, doc_type=None, doc_id=None)[source]

Add a document to the collection.

delete_doc(doc_type, doc_id)[source]

Delete a document from the collection.

get_doc(doc_type, doc_id)[source]

Get a document from the collection.

checkpoint(commit=True)[source]

Set a checkpoint on the collection.

This creates a resource on the server which can be queried to detect whether indexing has reached the checkpoint yet. All updates sent before the checkpoint will be processed before indexing reaches the checkpoint, and no updates sent after the checkpoint will be processed before indexing reaches the checkpoint.

taxonomies()[source]

Get a list of the taxonomy names.

taxonomy(taxonomy_name)[source]

Access a taxonomy, for getting and setting its hierarchy.

delete()[source]

Delete the entire collection.

class restpose.client.CheckPoint(collection, check_id)[source]

A checkpoint, used to check the progress of indexing.

check_id[source]

The ID of the checkpoint.

This is used to identify the checkpoint on the server.

reached[source]

Return true if the checkpoint has been reached.

May contact the server to check the current state.

Raises CheckPointExpiredError if the checkpoint expired before the state was checked.

errors[source]

Return the list of errors associated with the CheckPoint.

Note that if there are many errors, only the first few will be returned.

Returns None if the checkpoint hasn’t been reached yet.

Raises CheckPointExpiredError if the checkpoint expired before the state was checked.

total_errors[source]

Return the total count of errors associated with the CheckPoint.

This may be larger than len(self.errors), if there were more errors than the CheckPoint is able to hold.

Returns None if the checkpoint hasn’t been reached yet.

Raises CheckPointExpiredError if the checkpoint expired before the state was checked.

wait()[source]

Wait for the checkpoint to be reached.

This will contact the server, and wait until the checkpoint has been reached.

If the checkpoint expires (before or during the call), a CheckPointExpiredError will be raised. Otherwise, this will return the checkpoint, so that further methods can be chained on it.
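
The wait() method already blocks until the checkpoint is reached. Where a caller wants to bound the number of polls, a loop over the reached property might look like the sketch below. `FakeCheckPoint` is a hypothetical stand-in used so the example runs without a server, and `wait_with_limit` is an invented helper; only the `reached` and `total_errors` names come from the class documented above.

```python
import time


class FakeCheckPoint:
    """Hypothetical stand-in: becomes reached after a number of polls."""

    def __init__(self, polls_needed, errors=()):
        self._polls_left = polls_needed
        self._errors = list(errors)

    @property
    def reached(self):
        if self._polls_left > 0:
            self._polls_left -= 1
        return self._polls_left == 0

    @property
    def total_errors(self):
        # None until the checkpoint has been reached, as documented.
        return len(self._errors) if self._polls_left == 0 else None


def wait_with_limit(checkpoint, max_polls, poll_interval=0.0):
    """Poll `reached` up to max_polls times; return True if reached."""
    for _ in range(max_polls):
        if checkpoint.reached:
            return True
        time.sleep(poll_interval)
    return False


cp = FakeCheckPoint(polls_needed=3, errors=["bad document"])
```

A real caller would also need to handle CheckPointExpiredError, which this stand-in never raises.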

class restpose.client.Taxonomy(collection, taxonomy_name)[source]

A taxonomy; a hierarchy of category relationships.

A collection may have many taxonomies, each identified by a name. Each taxonomy contains a set of categories, and a tree of parent-child relationships (or, to use the correct mathematical terminology, a forest; i.e. there may be many disjoint trees of parent-child relationships).

This class allows the relationships in a taxonomy to be obtained and modified.

all()[source]

Get details about the entire set of categories in the taxonomy.

This returns a dict, keyed by category ID, in which each value is a list of parent category IDs.

Raises ResourceNotFound if the collection or taxonomy is not found.
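
A dict of the shape returned by all() can be explored with ordinary Python. The sketch below assumes that each list holds direct parents only, which the text above does not state explicitly; `roots`, `children` and the sample data are hypothetical.

```python
def roots(parents):
    """Categories with no parents: the roots of the trees in the forest."""
    return sorted(cat for cat, ps in parents.items() if not ps)


def children(parents):
    """Invert the child -> parents mapping into parent -> children."""
    result = {cat: [] for cat in parents}
    for cat, ps in parents.items():
        for parent in ps:
            result.setdefault(parent, []).append(cat)
    return {cat: sorted(kids) for cat, kids in result.items()}


# Hypothetical data in the documented shape.
parents = {"cheese": [], "blue": ["cheese"], "stilton": ["blue"], "bread": []}
```

Two roots in the result ("bread" and "cheese") correspond to two disjoint trees, which is the forest structure described for this class.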

add_parent(category, parent)[source]

Add a parent to a category.

Creates the collection, taxonomy, category and the parent, if necessary.

remove_parent(category, parent)[source]

Remove a parent from a category.

Creates the collection and taxonomy if they don’t already exist.

remove_category(category)[source]

Remove a category.

Creates the collection and taxonomy if they don’t already exist.

remove()[source]

Remove this entire taxonomy.

Query

Queries in RestPose.

class restpose.query.Searchable(target)[source]

An object which can be sliced or iterated to perform a query.

Create a new Searchable.

target is the object that the search will be performed on. For example, a restpose.Collection or restpose.DocumentType object.

page_size

Number of results to get in each request, if size is not explicitly set.

set_target(target)[source]

Return a searchable, with the target set.

If the target was already set to the same value, returns self. Otherwise, returns a copy of this searchable with the target set to the given value.

search()[source]

Explicitly force a search for this query to be performed.

This ignores any cached results, and always makes a call to the server.

The query should usually be sliced before calling this method. If the slice does not specify an endpoint, the server will use its internal limit on the number of results, so only a small number of results will be returned unless a larger number is explicitly set by slicing.

Returns:The results of the search.
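
To make the effect of slicing concrete, the sketch below shows one way a Python slice could be translated into an offset and a size for a search request. This is an assumption about the general approach, not a description of the client's internals; `slice_to_offset_size` is a hypothetical helper, and a size of None stands for "no endpoint given, let the server apply its own limit".

```python
def slice_to_offset_size(s):
    """Map a contiguous, non-negative slice to an (offset, size) pair."""
    if s.step not in (None, 1):
        raise ValueError("only contiguous slices are handled in this sketch")
    offset = s.start or 0
    if offset < 0 or (s.stop is not None and s.stop < 0):
        raise ValueError("negative indices are not handled in this sketch")
    # No endpoint means no explicit size.
    size = None if s.stop is None else max(s.stop - offset, 0)
    return offset, size
```

For example, slicing a query as `query[10:30]` would correspond to `slice(10, 30)`, which this helper maps to an offset of 10 and a size of 20.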
total_docs[source]

Get the total number of documents searched.

matches_lower_bound[source]

A lower bound on the number of matches.

matches_estimated[source]

An estimate of the number of matches.

matches_upper_bound[source]

An upper bound on the number of matches.

estimate_is_exact[source]

True if the value returned by matches_estimated is exact, False if it isn’t (or at least, isn’t guaranteed to be).
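
The three match counts bracket the true number of matches. A plausible reading, shown in the sketch below, is that the estimate can be treated as exact when the lower and upper bounds coincide. That rule is an assumption made for illustration, and `bounds_agree` is a hypothetical helper rather than part of the client.

```python
def bounds_agree(lower, estimated, upper):
    """Return True if the bounds leave no room for the estimate to be off."""
    if not lower <= estimated <= upper:
        raise ValueError("expected lower <= estimated <= upper")
    return lower == upper
```

Raising check_at_least generally narrows the gap between the bounds, at the cost of the server examining more documents.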

check_at_least(check_at_least)[source]

Set the check_at_least value.

This is the minimum number of documents to try to check when running the search; it is useful mainly when you want reasonably accurate counts of matching documents, but don’t want to retrieve all matches.

Returns a new Search, with the check_at_least value to use when performing the search set to the specified value.

order_by(field, ascending=None)[source]

Set the sort order.

info[source]

Get the list of information items returned by the search.

calc_occur(prefix, doc_limit=None, result_limit=None, get_termfreqs=False, stopwords=[])[source]

Get occurrence counts of terms in the matching documents.

Warning - fairly slow.

Causes the search results to contain counts for each term seen, in decreasing order of occurrence. The count entries are of the form: [suffix, occurrence count] or [suffix, occurrence count, termfreq] if get_termfreqs was true.

Parameters:
  • prefix – prefix of terms to check occurrence for
  • doc_limit – number of matching documents to stop checking after. None=unlimited. Integer or None. Default=None
  • result_limit – number of terms to return results for. None=unlimited. Integer or None. Default=None
  • get_termfreqs – set to true to also get frequencies of terms in the db. Boolean. Default=False
  • stopwords – list of stopwords - term suffixes to ignore. Array of strings. Default=[]
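
The shape of the occurrence counts can be illustrated with a local model that works on lists of terms. The sketch below is not the server's algorithm. It assumes each term is counted once per document and that ties are broken alphabetically; both are choices made for this example, and `occurrence_counts` is a hypothetical helper.

```python
from collections import Counter


def occurrence_counts(docs_terms, prefix, doc_limit=None,
                      result_limit=None, stopwords=()):
    """Return [suffix, count] entries in decreasing order of count."""
    stop = set(stopwords)
    counts = Counter()
    for terms in docs_terms[:doc_limit]:
        for term in set(terms):  # assumption: count once per document
            if term.startswith(prefix):
                suffix = term[len(prefix):]
                if suffix not in stop:
                    counts[suffix] += 1
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [[suffix, count] for suffix, count in ordered[:result_limit]]


# Hypothetical term lists for three matching documents.
docs = [["tcheese", "tedam", "xfoo"], ["tcheese", "tthe"], ["tcheese", "tedam"]]
```

Here `occurrence_counts(docs, "t", stopwords=["the"])` gives entries for "cheese" and "edam" only: the term "xfoo" lacks the prefix and "the" is a stopword.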
calc_cooccur(prefix, doc_limit=None, result_limit=None, get_termfreqs=False, stopwords=[])[source]

Get cooccurrence counts of terms in the matching documents.

Warning - fairly slow (and O(L*L), where L is the average document length).

Causes the search results to contain counts for each pair of terms seen, in decreasing order of cooccurrence. The count entries are of the form: [suffix1, suffix2, co-occurrence count] or [suffix1, suffix2, co-occurrence count, termfreq of suffix1, termfreq of suffix2] if get_termfreqs was true.

Parameters:
  • prefix – prefix of terms to check co-occurrence for
  • doc_limit – number of matching documents to stop checking after. None=unlimited. Integer or None. Default=None
  • result_limit – number of terms to return results for. None=unlimited. Integer or None. Default=None
  • get_termfreqs – set to true to also get frequencies of terms in the db. Boolean. Default=False
  • stopwords – list of stopwords - term suffixes to ignore. Array of strings. Default=[]
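
The quadratic cost mentioned in the warning comes from examining every pair of terms in each document. The local model below makes that visible through `itertools.combinations`. As with the previous sketch, it is only an illustration under stated assumptions (one count per document per pair, alphabetical tie-breaking), and `cooccurrence_counts` is a hypothetical helper.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs_terms, prefix, doc_limit=None,
                        result_limit=None, stopwords=()):
    """Return [suffix1, suffix2, count] entries, most frequent first."""
    stop = set(stopwords)
    counts = Counter()
    for terms in docs_terms[:doc_limit]:
        suffixes = {t[len(prefix):] for t in terms if t.startswith(prefix)}
        # Every unordered pair in the document: O(L*L) pairs for L terms.
        for pair in combinations(sorted(suffixes - stop), 2):
            counts[pair] += 1
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [[a, b, count] for (a, b), count in ordered[:result_limit]]


# Hypothetical term lists for three matching documents.
docs = [["ta", "tb", "tc"], ["ta", "tb"], ["tb", "tc"]]
```

Setting doc_limit is the most direct way to bound the work, since the pair enumeration is repeated for each document checked.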
class restpose.query.QueryIterator(query)[source]

Iterate over the results of a query.

next()
class restpose.query.Query(target=None)[source]

Base class of all queries.

All query subclasses should have a property called “_query”, containing the query as a structure ready to be converted to JSON and sent to the server.

filter(other)[source]

Return the results of this query filtered by another query.

This returns only documents which match both the original and the filter query, but uses only the weights from the original query.

Parameters:

other – The query to combine with this query.

Example :

A query returning documents in which the tag field contains the value 'foo', filtered to only include documents in which the tag field also contains the value 'bar'.

>>> query = Field('tag').equals('foo').filter(Field('tag').equals('bar'))
and_maybe(other)[source]

Return the results of this query, with additional weights from another query.

This returns exactly the documents which match the original query, but adds the weight from corresponding matches to the other query.

Parameters:

other – The query to combine with this query.

Example :

A query returning documents in which the tag field contains the value 'foo', but with additional weights for any matches containing the value 'bar'.

>>> query = Field('tag').equals('foo').and_maybe(Field('tag').equals('bar'))
class restpose.query.QueryField(fieldname, querytype, value, target=None)[source]

A query in a particular field.

class restpose.query.QueryMeta(querytype, value, target=None)[source]

A query for meta information (about field presence, errors, etc).

class restpose.query.QueryAll(target=None)[source]

A query which matches all documents.

class restpose.query.QueryNone(target=None)[source]

A query which matches no documents.

restpose.query.QueryNothing

alias of QueryNone

class restpose.query.CombinedQuery(*queries, **kwargs)[source]

Base class of Queries which are combinations of a sequence of queries.

Subclasses must define self._op, the operator to use to combine queries.

class restpose.query.And(*queries, **kwargs)[source]

A query which matches only the documents matched by all subqueries.

The weights are the sum of the weights in the subqueries.

Example :

A query returning documents in which the tag field contains both the value 'foo' and the value 'bar'.

>>> query = And(Field('tag').equals('foo'),
...             Field('tag').equals('bar'))
class restpose.query.Or(*queries, **kwargs)[source]

A query which matches the documents matched by any subquery.

The weights are the sum of the weights in the subqueries which match.

Example :

A query returning documents in which the tag field contains at least one of the value 'foo' or the value 'bar'.

>>> query = Or(Field('tag').equals('foo'),
...            Field('tag').equals('bar'))
class restpose.query.Xor(*queries, **kwargs)[source]

A query which matches the documents matched by an odd number of subqueries.

The weights are the sum of the weights in the subqueries which match.

Example :

A query returning documents in which the tag field contains exactly one of the value 'foo' or the value 'bar'.

>>> query = Xor(Field('tag').equals('foo'),
...             Field('tag').equals('bar'))
class restpose.query.AndNot(*queries, **kwargs)[source]

A query which matches the documents matched by the first subquery, but not any of the other subqueries.

The weights returned are the weights in the first subquery.

Example :

A query returning documents in which the tag field contains the value 'foo' but not the value 'bar'.

>>> query = AndNot(Field('tag').equals('foo'),
...                Field('tag').equals('bar'))
class restpose.query.Filter(*queries, **kwargs)[source]

A query which matches the documents matched by all the subqueries, but only returns weights from the first subquery.

Example :

A query returning documents in which the tag field contains the value 'foo', with weights from this match, but only where the tag field also contains the value 'bar'.

>>> query = Filter(Field('tag').equals('foo'),
...                Field('tag').equals('bar'))
class restpose.query.AndMaybe(*queries, **kwargs)[source]

A query which matches the documents matched by the first subquery, but adds additional weights from the other subqueries.

The weights are the sum of the weights in the subqueries.

Example :

A query returning documents in which the tag field contains the value 'foo', with weights from this match, but with additional weights for any of these documents in which the tag field contains the value 'bar'.

>>> query = AndMaybe(Field('tag').equals('foo'),
...                  Field('tag').equals('bar'))
class restpose.query.MultWeight(query, factor, target=None)[source]

A query which matches all the documents matched by another query, but with the weights multiplied by a factor.

Example :

A query returning documents in which the tag field contains the value 'foo', with weights multiplied by 2.5.

>>> query = MultWeight(Field('tag').equals('foo'), 2.5)

Build a query in which the weights are multiplied by a factor.
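
The weight rules of the combining queries in this module can be compared side by side with a small local model, in which the result of a query is a dict mapping document IDs to weights. The functions below are hypothetical and only restate the rules given in the class descriptions; they do not build RestPose queries.

```python
def q_and(*results):
    docs = set(results[0]).intersection(*results[1:])
    return {d: sum(r[d] for r in results) for d in docs}


def q_or(*results):
    docs = set().union(*results)
    return {d: sum(r.get(d, 0.0) for r in results) for d in docs}


def q_xor(*results):
    # Keep documents matched by an odd number of subqueries.
    docs = set().union(*results)
    return {d: sum(r.get(d, 0.0) for r in results)
            for d in docs if sum(d in r for r in results) % 2 == 1}


def q_and_not(first, *others):
    excluded = set().union(*others)
    return {d: w for d, w in first.items() if d not in excluded}


def q_filter(first, *others):
    # Matches must be in every subquery; weights come from the first only.
    return {d: w for d, w in first.items() if all(d in r for r in others)}


def q_and_maybe(first, *others):
    # Same documents as the first subquery, with extra weight where others match.
    return {d: w + sum(r.get(d, 0.0) for r in others) for d, w in first.items()}


def q_mult_weight(result, factor):
    return {d: w * factor for d, w in result.items()}


# Hypothetical results for the 'foo' and 'bar' queries used in the examples.
foo = {1: 1.0, 2: 2.0}
bar = {2: 0.5, 3: 4.0}
```

With these inputs, Filter and And return the same single document (2) but with different weights, while AndMaybe keeps both of the documents matched by 'foo'.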

class restpose.query.TerminalQuery(orig, slice=None)[source]

A Query which has had offsets or additional search options set.

This is produced from a Query when additional search options are set. It can’t be combined with other Query objects, since the semantics of doing so would be confusing.

class restpose.query.SearchResult(rank, data)[source]
class restpose.query.SearchResults(raw)[source]

The results returned from the server when performing a search.

total_docs

The total number of documents searched.

offset

The offset of the first result item.

size_requested

The requested size.

check_at_least

The requested check_at_least value.

matches_lower_bound

A lower bound on the number of matches.

matches_estimated

An estimate of the number of matches.

matches_upper_bound

An upper bound on the number of matches.

estimate_is_exact[source]

Return True if the value returned by matches_estimated is exact, False if it isn’t (or at least, isn’t guaranteed to be).

items[source]

The matching result items.

info[source]

The list of information items returned from the server.

at_rank(rank)[source]

Get the result at a given rank.

The rank is the position in the entire result set, starting at 0.

Raises IndexError if the rank is out of the range in the result set.
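
Because ranks count from the start of the entire result set, a page of results that begins at a non-zero offset has to translate a rank into a local index. The sketch below shows that translation with a hypothetical `ResultPage` class. It assumes "out of range" means outside the items actually retrieved, which is one reading of the description above.

```python
class ResultPage:
    """Hypothetical stand-in for one retrieved page of results."""

    def __init__(self, offset, items):
        self.offset = offset
        self.items = list(items)

    def at_rank(self, rank):
        # Convert the global rank into a position within this page.
        index = rank - self.offset
        if index < 0 or index >= len(self.items):
            raise IndexError("rank %d is outside this result set" % rank)
        return self.items[index]


page = ResultPage(offset=10, items=["a", "b", "c"])
```

In this model the valid ranks for `page` are 10, 11 and 12; rank 0 raises IndexError because the page does not start at the beginning of the result set.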

Errors

Errors specific to RestPose.

exception restpose.errors.RestPoseError[source]
exception restpose.errors.CheckPointExpiredError[source]

An error raised when a checkpoint has expired.

Resource

Resources for RestPose.

This module provides a convenient interface to the resources exposed via HTTP by the RestPose server.

class restpose.resource.RestPoseResponse(connection, request, resp)[source]

A response from the RestPose server.

In addition to the properties exposed by restkit.Response, this exposes a json property, to decode JSON responses automatically.

json[source]

Get the response body as JSON.

Returns:The response body as a python object, decoded from JSON, if the response Content-Type was application/json.
Raises :an exception if the Content-Type is not application/json, or the body is not valid JSON.
Raises :RestPoseError if the status code returned is not one of the supplied status codes.
expect_status(*expected)[source]

Check that the status code is one of a set of expected codes.

Parameters:expected – The expected status codes.
Raises :RestPoseError if the status code returned is not one of the supplied status codes.
class restpose.resource.RestPoseResource(uri, **client_opts)[source]

A resource providing access to a RestPose server.

This may be subclassed and provided to restpose.Server, to allow requests to be monitored or modified. For example, a logging subclass could be used to record requests and their responses.

Initialise the resource.

Parameters:
  • uri – The full URI for the resource.
  • client_opts – Any options to be passed to restkit.Resource.
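
The logging use case mentioned above amounts to overriding request and delegating to the parent class. The sketch below shows the pattern with a hypothetical `BaseResource` standing in for RestPoseResource, so that it runs without restkit or a server; the canned response dict is also invented. A real subclass would inherit from RestPoseResource and be passed to Server through resource_class.

```python
class BaseResource:
    """Hypothetical stand-in for RestPoseResource; returns a canned reply."""

    def __init__(self, uri, **client_opts):
        self.uri = uri
        self.client_opts = client_opts

    def request(self, method, path=None, payload=None, headers=None, **params):
        return {"status": 200, "method": method, "path": path}


class LoggingResource(BaseResource):
    """Records each request and the status of its response."""

    def __init__(self, uri, **client_opts):
        super().__init__(uri, **client_opts)
        self.log = []

    def request(self, method, path=None, payload=None, headers=None, **params):
        response = super().request(method, path=path, payload=payload,
                                   headers=headers, **params)
        self.log.append((method, path, response["status"]))
        return response


resource = LoggingResource("http://127.0.0.1:7777")
resource.request("GET", "/status")
```

Keeping the override's signature identical to the parent's means the rest of the client can use the subclass without changes.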
user_agent

The user agent to send when making requests.

request(method, path=None, payload=None, headers=None, **params)[source]

Perform a request.

Parameters:
  • method – the HTTP method to use, as a string.
  • path – The path to request.
  • payload – A payload to send as the request body; may be a file-like object, or a string, or a structure to send encoded as a JSON object.
  • headers – A dictionary of headers. If not already set, Accept and User-Agent headers will be added to this, and if there is a JSON payload, the Content-Type will be set to application/json.
  • params_dict – A dictionary of parameters to add to the request URI.
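
The header handling described for request can be sketched as a standalone function. This is a model of the documented behaviour, not the client's code: `prepare_request` is a hypothetical helper, the Accept value of application/json and the default user agent string are assumptions, and the sketch overwrites any existing Content-Type when it encodes a JSON payload.

```python
import json


def prepare_request(payload=None, headers=None, user_agent="example-client/0.0"):
    """Return (body, headers) with defaults filled in."""
    headers = dict(headers or {})
    # Only add these if the caller has not already set them.
    headers.setdefault("Accept", "application/json")
    headers.setdefault("User-Agent", user_agent)
    is_raw = isinstance(payload, (str, bytes)) or hasattr(payload, "read")
    if payload is not None and not is_raw:
        # A structure is encoded as JSON and labelled accordingly.
        payload = json.dumps(payload)
        headers["Content-Type"] = "application/json"
    return payload, headers
```

Strings and file-like objects are passed through untouched, so the caller remains responsible for their Content-Type.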
