From faa409a4c2a43a8f8ba3828e4312c8ca0f10d029 Mon Sep 17 00:00:00 2001 From: Natalia Ogoreltseva Date: Wed, 3 Mar 2021 16:06:07 +0300 Subject: [PATCH 1/9] small bugfix --- doc/book/box/data_model.rst | 3 ++- doc/book/box/indexes.rst | 8 ++++---- doc/reference/reference_lua/box_space/create_index.rst | 4 ++-- 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/doc/book/box/data_model.rst b/doc/book/box/data_model.rst index cd35303a2c..c62172e711 100644 --- a/doc/book/box/data_model.rst +++ b/doc/book/box/data_model.rst @@ -386,7 +386,8 @@ Full information is in section | ``'string'`` | :ref:`string ` | TREE, BITSET or HASH | | (may also be called ``‘str’``) | | | +--------------------------------+-------------------------------------------+--------------------------------------+ - | ``'varbinary'`` | :ref:`varbinary ` | TREE, BITSET or HASH | + | ``'varbinary'`` | :ref:`varbinary ` | TREE, HASH or BITSET | + | | | (since version 2.7) | +--------------------------------+-------------------------------------------+--------------------------------------+ | ``'uuid'`` | :ref:`uuid ` | TREE or HASH | +--------------------------------+-------------------------------------------+--------------------------------------+ diff --git a/doc/book/box/indexes.rst b/doc/book/box/indexes.rst index 9b861250b9..c23f234259 100644 --- a/doc/book/box/indexes.rst +++ b/doc/book/box/indexes.rst @@ -181,8 +181,8 @@ Use HASH index: * if it is a secondary key * if you 100% won't need to make it non-unique -* if you really need that 2-5% performance improvement -* if you have taken measurements on your data and you see an increase in performance +* if you have taken measurements on your data and you see an accountable + increase in performance * if you save every byte on tuples (HASH is a little more compact) .. _indexes-rtree: @@ -644,5 +644,5 @@ And :doc:`insert ` another tuple: :class: fact You can add, drop, or alter the definitions at runtime, with some restrictions. 
- Read more in section :ref:`index operations ` - and in reference for :doc:`box.index submodule `. + Read more about index operations in reference for + :doc:`box.index submodule `. diff --git a/doc/reference/reference_lua/box_space/create_index.rst b/doc/reference/reference_lua/box_space/create_index.rst index 19dc13405a..6cb9a42871 100644 --- a/doc/reference/reference_lua/box_space/create_index.rst +++ b/doc/reference/reference_lua/box_space/create_index.rst @@ -172,7 +172,7 @@ and what index types are allowed. :header-rows: 1 * - Index field type - - What can be it it + - What can be in it - Where it is legal - Examples @@ -201,7 +201,7 @@ and what index types are allowed. `. A varbinary byte sequence does not have a :ref:`collation ` because its contents are not UTF-8 characters - - memtx TREE or HASH indexes; + - memtx TREE, HASH or BITSET (since version 2.7) indexes; vinyl TREE indexes - '\\65 \\66 \\67' From 7b694b929fe508bb135117b98942aaa458644139 Mon Sep 17 00:00:00 2001 From: Natalia Ogoreltseva Date: Thu, 4 Mar 2021 17:54:29 +0300 Subject: [PATCH 2/9] revise content of indexes.rst --- doc/book/box/indexes.rst | 478 +++++++++++++++++++-------------------- 1 file changed, 228 insertions(+), 250 deletions(-) diff --git a/doc/book/box/indexes.rst b/doc/book/box/indexes.rst index c23f234259..4c0274219d 100644 --- a/doc/book/box/indexes.rst +++ b/doc/book/box/indexes.rst @@ -28,49 +28,214 @@ Indexes have certain limitations. See details on page Creating an index -------------------------------------------------------------------------------- -An index definition may include identifiers of tuple fields and their expected -**types**. See allowed indexed field types in section -:ref:`Details about indexed field types `. +The simple :doc:`index-creation ` +operation is: -.. NOTE:: +.. cssclass:: highlight +.. 
parsed-literal:: - A recommended design pattern for a data model is to base primary keys on the - first fields of a tuple, because this speeds up tuple comparison. + :extsamp:`box.space.{**{space-name}**}:create_index('{*{index-name}*}')` -Let's look at an example where we first define the primary index (named 'primary') -based on field #1 of each tuple: +This creates a unique :ref:`TREE ` index on the first field +of all tuples (often called "Field#1"), which is assumed to be numeric. -.. code-block:: tarantoolsession +A recommended design pattern for a data model is to base primary keys on the +first fields of a tuple, because this speeds up tuple comparison. - tarantool> i = s:create_index('primary', {type = 'hash', parts = {{field = 1, type = 'unsigned'}}} +The simple :doc:`SELECT ` request is: -The effect is that, for all tuples in space 'tester', field #1 must exist and -must contain an unsigned integer. -The index type is HASH, so values in field #1 must be unique, because keys -in HASH indexes are unique. +.. cssclass:: highlight +.. parsed-literal:: -After that, let's define a secondary index (named 'secondary') based on field #2 -of each tuple: + :extsamp:`box.space.{**{space-name}**}:select({*{value}*})` -.. code-block:: tarantoolsession +This looks for a single tuple via the first index. Since the first index +is always unique, the maximum number of returned tuples will be 1. +You can call ``select()`` without arguments, and it will return all tuples. + +An index definition may also include identifiers of tuple fields +and their expected **types**. See allowed indexed field types in section +:ref:`Details about indexed field types `: - tarantool> i = s:create_index('secondary', {type = 'tree', parts = {field = 2, type = 'string'}}) +.. cssclass:: highlight +.. parsed-literal:: -The effect is that, for all tuples in space 'tester', field #2 must exist and -must contain a string. 
-The index type is TREE, so values in field #2 must not be unique, because keys -in TREE indexes may be non-unique. + :extsamp:`box.space.{**{space-name}**}:create_index('primary', {type = 'hash', parts = {{field = 1, type = 'unsigned'}}})` Space definitions and index definitions are stored permanently in Tarantool's -system spaces :ref:`_space ` and :ref:`_index ` -(for details, see reference on :ref:`box.space ` submodule). +system spaces :ref:`_space ` and :ref:`_index `. .. admonition:: Tip :class: fact - The full information about creating an index is in section + See full information about creating indexes, such as + how to create an index using the ``path`` option, or + how to create a functional index in our reference for :doc:`/reference/reference_lua/box_space/create_index`. +.. _index-box_index-operations: + +-------------------------------------------------------------------------------- +Index operations +-------------------------------------------------------------------------------- + +Index operations are automatic: if a data-manipulation request changes a tuple, +then it also changes the index keys defined for the tuple. + +#. For further demonstrations let's create a sample space named 'tester' and + put it in a variable 'my_space': + + .. code-block:: tarantoolsession + + tarantool> my_space = box.schema.space.create('tester') + +#. Then format the created space by specifying field names and types: + + .. code-block:: tarantoolsession + + tarantool> my_space:format({ + > {name = 'id', type = 'unsigned'}, + > {name = 'band_name', type = 'string'}, + > {name = 'year', type = 'unsigned'}, + > {name = 'rate', type = 'unsigned', is_nullable=true}}) + +#. Create the **primary** index (named ``primary``): + + .. code-block:: tarantoolsession + + tarantool> my_space:create_index('primary', { + > type = 'hash', + > parts = {'id'} + > }) + + This is a primary index based on the ``id`` field of each tuple. + +#. 
And insert some :ref:`tuples ` (that are records in Tarantool) + into the space: + + .. code-block:: tarantoolsession + + tarantool> my_space:insert{1, 'Roxette', 1986, 5} + tarantool> my_space:insert{2, 'Scorpions', 2015, 4} + tarantool> my_space:insert{3, 'Ace of Base', 1993} + tarantool> my_space:insert{4, 'Roxette', 2016, 3} + +#. Create a **secondary index**: + + .. code-block:: tarantoolsession + + tarantool> box.space.tester:create_index('secondary', {parts = {{field=3, type='unsigned'}}}) + --- + - unique: true + parts: + - type: unsigned + is_nullable: false + fieldno: 3 + id: 2 + space_id: 512 + type: TREE + name: secondary + ... + +#. Create a **multi-part index** with three parts: + + .. code-block:: tarantoolsession + + tarantool> box.space.tester:create_index('thrine', {parts = {{field = 2, type = 'string'}, {field=3, type='unsigned'}, {field=4, type='unsigned'}}}) + --- + - unique: true + parts: + - type: string + is_nullable: false + fieldno: 2 + - type: unsigned + is_nullable: false + fieldno: 3 + - type: unsigned + is_nullable: true + fieldno: 4 + id: 6 + space_id: 513 + type: TREE + name: thrine + ... + +**There are the following SELECT variations:** + +#. The search can use comparisons other than equality: + + .. code-block:: tarantoolsession + + tarantool> box.space.tester:select(1, {iterator = 'GT'}) + --- + - - [2, 'Scorpions', 2015, 4] + - [3, 'Ace of Base', 1993] + - [4, 'Roxette', 2016, 3] + ... + + The :ref:`comparison operators ` are LT, LE, EQ, + REQ, GE, GT (for "less than", "less than or equal", "equal", "reversed equal", + "greater than or equal", "greater than" respectively). + Comparisons make sense if and only if the index type is TREE. + + Note that we didn't use the name of the index, which means we use the primary index here. + + This type of search may return more than one tuple; if so, the tuples will be + in descending order by key when the comparison operator is LT or LE or REQ, + otherwise in ascending order. + +#. 
The search can use a **secondary index**. + + For a primary-key search, it is optional to specify an index name as + was demonstrated above. + For a secondary-key search, it is mandatory. + + .. code-block:: tarantoolsession + + tarantool> box.space.tester.index.secondary:select({1993}) + --- + - - [3, 'Ace of Base', 1993] + ... + + .. _partial_key_search: + +#. **Partial key search:** The search may be for some key parts starting with + the prefix of the key. Notice that partial key searches are available + only in TREE indexes. + + .. code-block:: tarantoolsession + + tarantool> box.space.tester.index.thrine:select({'Scorpions', 2015}) + --- + - - [2, 'Scorpions', 2015, 4] + ... + +#. The search can be for all fields, using a table as the value: + + .. code-block:: tarantoolsession + + tarantool> box.space.tester.index.thrine:select({'Roxette', 2016, 3}) + --- + - - [4, 'Roxette', 2016, 3] + ... + + or the search can be for one field, using a table or a scalar: + + .. code-block:: tarantoolsession + + tarantool> box.space.tester.index.thrine:select({'Roxette'}) + --- + - - [1, 'Roxette', 1986, 5] + - [4, 'Roxette', 2016, 3] + ... + +.. admonition:: Tip + :class: fact + + You can also add, drop, or alter the definitions at runtime, with some restrictions. + Read more about index operations in reference for + :doc:`box.index submodule `. + -------------------------------------------------------------------------------- Index types -------------------------------------------------------------------------------- @@ -205,9 +370,10 @@ RTREE index can accept two types of ``distance`` functions: ``euclid`` and ``man .. 
code-block:: lua - s = box.schema.create_space("test") - i = s:create_index('primary', { type = 'HASH', parts = {1, 'num'} }) - r = s:create_index('spatial', { type = 'RTREE', unique = false, parts = {2, 'array'} }) + my_space = box.schema.create_space("test") + my_space:format{ { type= 'number', name='id' }, { type='array', name='content' } } + hash_index = my_space:create_index('primary', { type = 'HASH', parts = {'id'} }) + rtree_index = my_space:create_index('spatial', { type = 'RTREE', unique = false, parts = {'content'} }) Corresponding tuple field thus must be an array of 2 or 4 numbers. 2 numbers mean a point {x, y}; @@ -216,8 +382,8 @@ where (x1, y1) and (x2, y2) - diagonal point of the rectangle. .. code-block:: lua - s:insert{1, {1, 1}} - s:insert{2, {2, 2, 3, 3}} + my_space:insert{1, {1, 1}} + my_space:insert{2, {2, 2, 3, 3}} Selection results depend on a chosen iterator. The default EQ iterator searches for an exact rectangle, @@ -225,22 +391,22 @@ a point is treated as zero width and height rectangle: .. code-block:: tarantoolsession - tarantool> r:select{1, 1} + tarantool> rtree_index:select{1, 1} --- - - [1, [1, 1]] ... - tarantool> r:select{1, 1, 1, 1} + tarantool> rtree_index:select{1, 1, 1, 1} --- - - [1, [1, 1]] ... - tarantool> r:select{2, 2} + tarantool> rtree_index:select{2, 2} --- - [] ... - tarantool> r:select{2, 2, 3, 3} + tarantool> rtree_index:select{2, 2, 3, 3} --- - - [2, [2, 2, 3, 3]] ... @@ -250,7 +416,7 @@ selects all tuples in arbitrary order: .. code-block:: tarantoolsession - tarantool> r:select{} + tarantool> rtree_index:select{} --- - - [1, [1, 1]] - [2, [2, 2, 3, 3]] @@ -261,7 +427,7 @@ within a specified rectangle: .. code-block:: tarantoolsession - tarantool> r:select({1, 1, 2, 2}, {iterator='le'}) + tarantool> rtree_index:select({1, 1, 2, 2}, {iterator='le'}) --- - - [1, [1, 1]] ... @@ -271,7 +437,7 @@ with their rectangles strictly within a specified rectangle: .. 
code-block:: tarantoolsession - tarantool> r:select({0, 0, 3, 3}, {iterator='lt'}) + tarantool> rtree_index:select({0, 0, 3, 3}, {iterator='lt'}) --- - - [1, [1, 1]] ... @@ -280,7 +446,7 @@ Iterator GE searches for tuples with a specified rectangle within their rectangl .. code-block:: tarantoolsession - tarantool> r:select({1, 1}, {iterator='ge'}) + tarantool> rtree_index:select({1, 1}, {iterator='ge'}) --- - - [1, [1, 1]] ... @@ -289,7 +455,7 @@ Iterator GT searches for tuples with a specified rectangle strictly within their .. code-block:: tarantoolsession - tarantool> r:select({2.1, 2.1, 2.9, 2.9}, {itearator='gt'}) + tarantool> rtree_index:select({2.1, 2.1, 2.9, 2.9}, {itearator='gt'}) --- - [] ... @@ -298,7 +464,7 @@ Iterator OVERLAPS searches for tuples with their rectangles overlapping specifie .. code-block:: tarantoolsession - tarantool> r:select({0, 0, 10, 2}, {iterator='overlaps'}) + tarantool> rtree_index:select({0, 0, 10, 2}, {iterator='overlaps'}) --- - - [1, [1, 1]] - [2, [2, 2, 3, 3]] @@ -310,13 +476,13 @@ Iterator NEIGHBOR searches for all tuples and orders them by distance to the spe tarantool> for i=1,10 do > for j=1,10 do - > s:insert{i*10+j, {i, j, i+1, j+1}} + > my_space:insert{i*10+j, {i, j, i+1, j+1}} > end > end --- ... - tarantool> r:select({1, 1}, {iterator='neighbor', limit=5}) + tarantool> rtree_index:select({1, 1}, {iterator='neighbor', limit=5}) --- - - [11, [1, 1, 2, 2]] - [12, [1, 2, 2, 3]] @@ -333,39 +499,24 @@ Here's short example of using 4D tree: .. code-block:: tarantoolsession - tarantool> s = box.schema.create_space('test') - --- - ... - - tarantool> i = s:create_index('primary', { type = 'HASH', parts = {1, 'num'} }) - --- - ... - - tarantool> r = s:create_index('spatial', { type = 'RTREE', unique = false, dimension = 4, parts = {2, 'array'} }) - --- - ... - - tarantool> s:insert{1, {1, 2, 3, 4}} -- insert 4D point - --- - - [1, [1, 2, 3, 4]] - ... 
- - tarantool> s:insert{2, {1, 1, 1, 1, 2, 2, 2, 2}} -- insert 4D box - --- - - [2, [1, 1, 1, 1, 2, 2, 2, 2]] - ... + tarantool> my_space = box.schema.create_space("test") + tarantool> my_space:format{ { type= 'number', name='id' }, { type='array', name='content' } } + tarantool> hash_index = my_space:create_index('primary', { type = 'HASH', parts = {'id'} }) + tarantool> rtree_index = my_space:create_index('spatial', { type = 'RTREE', unique = false, dimension = 4, parts = {'content'} }) + tarantool> my_space:insert{1, {1, 2, 3, 4}} -- insert 4D point + tarantool> my_space:insert{2, {1, 1, 1, 1, 2, 2, 2, 2}} -- insert 4D box - tarantool> r:select{1, 2, 3, 4} -- find exact point + tarantool> rtree_index:select{1, 2, 3, 4} -- find exact point --- - - [1, [1, 2, 3, 4]] ... - tarantool> r:select({0, 0, 0, 0, 3, 3, 3, 3}, {iterator = 'LE'}) -- select from 4D box + tarantool> rtree_index:select({0, 0, 0, 0, 3, 3, 3, 3}, {iterator = 'LE'}) -- select from 4D box --- - - [2, [1, 1, 1, 1, 2, 2, 2, 2]] ... - tarantool> r:select({0, 0, 0, 0}, {iterator = 'neighbor'}) -- select neighbours + tarantool> rtree_index:select({0, 0, 0, 0}, {iterator = 'neighbor'}) -- select neighbours --- - - [2, [1, 1, 1, 1, 2, 2, 2, 2]] - [1, [1, 2, 3, 4]] @@ -378,7 +529,7 @@ Here's short example of using 4D tree: that could be tons of data with corresponding performance. And another frequent mistake is to specify iterator type without quotes, - in such way: ``r:select(rect, {iterator = LE})``. + in such way: ``rtree_index:select(rect, {iterator = LE})``. This leads to silent EQ select, because ``LE`` is undefined variable and treated as nil, so iterator is unset and default used. @@ -400,41 +551,41 @@ and bit values are entered as hexadecimal literals for easier reading. .. 
code-block:: tarantoolsession - tarantool> s = box.schema.space.create('space_with_bitset') - tarantool> s:create_index('primary_index', { + tarantool> my_space = box.schema.space.create('space_with_bitset') + tarantool> my_space:create_index('primary_index', { > parts = {1, 'string'}, > unique = true, > type = 'TREE' > }) - tarantool> s:create_index('bitset_index', { + tarantool> my_space:create_index('bitset_index', { > parts = {2, 'unsigned'}, > unique = false, > type = 'BITSET' > }) - tarantool> s:insert{'Tuple with bit value = 01', 0x01} - tarantool> s:insert{'Tuple with bit value = 10', 0x02} - tarantool> s:insert{'Tuple with bit value = 11', 0x03} - tarantool> s.index.bitset_index:select(0x02, { + tarantool> my_space:insert{'Tuple with bit value = 01', 0x01} + tarantool> my_space:insert{'Tuple with bit value = 10', 0x02} + tarantool> my_space:insert{'Tuple with bit value = 11', 0x03} + tarantool> my_space.index.bitset_index:select(0x02, { > iterator = box.index.EQ > }) --- - - ['Tuple with bit value = 10', 2] ... - tarantool> s.index.bitset_index:select(0x02, { + tarantool> my_space.index.bitset_index:select(0x02, { > iterator = box.index.BITS_ANY_SET > }) --- - - ['Tuple with bit value = 10', 2] - ['Tuple with bit value = 11', 3] ... - tarantool> s.index.bitset_index:select(0x02, { + tarantool> my_space.index.bitset_index:select(0x02, { > iterator = box.index.BITS_ALL_SET > }) --- - - ['Tuple with bit value = 10', 2] - ['Tuple with bit value = 11', 3] ... - tarantool> s.index.bitset_index:select(0x02, { + tarantool> my_space.index.bitset_index:select(0x02, { > iterator = box.index.BITS_ALL_NOT_SET > }) --- @@ -473,176 +624,3 @@ specific to an index type. For example, they can be used for evaluating Boolean expressions when traversing BITSET indexes, or for going in descending order when traversing TREE indexes. - -.. 
_index-box_index-operations: - --------------------------------------------------------------------------------- -Index operations --------------------------------------------------------------------------------- - -Index operations are automatic: if a data-manipulation request changes a tuple, -then it also changes the index keys defined for the tuple. - -The simple :doc:`index-creation ` -operation that we've illustrated before is: - -.. cssclass:: highlight -.. parsed-literal:: - - :samp:`box.space.{space-name}:create_index('{index-name}')` - -This creates a unique TREE index on the first field of all tuples -(often called "Field#1"), which is assumed to be numeric. - -The simple :doc:`SELECT ` request -that we've illustrated before is: - -.. cssclass:: highlight -.. parsed-literal:: - - :extsamp:`box.space.{*{space-name}*}:select({*{value}*})` - -This looks for a single tuple via the first index. Since the first index -is always unique, the maximum number of returned tuples will be 1. -You can call ``select()`` without arguments, and it will return all tuples. - -Let's continue working with the space 'tester' created in the :ref:`"Getting -started" exercises ` but first modify it via -:doc:`format() `: - -.. code-block:: tarantoolsession - - tarantool> box.space.tester:format({ - > {name = 'id', type = 'unsigned'}, - > {name = 'band_name', type = 'string'}, - > {name = 'year', type = 'unsigned'}, - > {name = 'rate', type = 'unsigned', is_nullable=true}}) - --- - ... - -Add the rate to the tuple #1 and #2 via -:doc:`update function `: - -.. code-block:: tarantoolsession - - tarantool> box.space.tester:update(1, {{'=', 4, 5}}) - --- - - [1, 'Roxette', 1986, 5] - ... - tarantool> box.space.tester:update(2, {{'=', 4, 4}}) - --- - - [2, 'Scorpions', 2015, 4] - ... - -And :doc:`insert ` another tuple: - -.. code-block:: tarantoolsession - - tarantool> box.space.tester:insert({4, 'Roxette', 2016, 3}) - --- - - [4, 'Roxette', 2016, 3] - ... 
- -**The existing SELECT variations:** - -1. The search can use comparisons other than equality. - - .. code-block:: tarantoolsession - - tarantool> box.space.tester:select(1, {iterator = 'GT'}) - --- - - - [2, 'Scorpions', 2015, 4] - - [3, 'Ace of Base', 1993] - - [4, 'Roxette', 2016, 3] - ... - - The :ref:`comparison operators ` are LT, LE, EQ, REQ, GE, GT - (for "less than", "less than or equal", "equal", "reversed equal", - "greater than or equal", "greater than" respectively). - Comparisons make sense if and only if the index type is TREE. - - This type of search may return more than one tuple; if so, the tuples will be - in descending order by key when the comparison operator is LT or LE or REQ, - otherwise in ascending order. - -2. The search can use a secondary index. - - For a primary-key search, it is optional to specify an index name. - For a secondary-key search, it is mandatory. - - .. code-block:: tarantoolsession - - tarantool> box.space.tester:create_index('secondary', {parts = {{field=3, type='unsigned'}}}) - --- - - unique: true - parts: - - type: unsigned - is_nullable: false - fieldno: 3 - id: 2 - space_id: 512 - type: TREE - name: secondary - ... - tarantool> box.space.tester.index.secondary:select({1993}) - --- - - - [3, 'Ace of Base', 1993] - ... - - .. _partial_key_search: - -3. The search may be for some key parts starting with the prefix of - the key. Notice that partial key searches are available only in TREE indexes. - - .. code-block:: tarantoolsession - - -- Create an index with three parts - tarantool> box.space.tester:create_index('tertiary', {parts = {{field = 2, type = 'string'}, {field=3, type='unsigned'}, {field=4, type='unsigned'}}}) - --- - - unique: true - parts: - - type: string - is_nullable: false - fieldno: 2 - - type: unsigned - is_nullable: false - fieldno: 3 - - type: unsigned - is_nullable: true - fieldno: 4 - id: 6 - space_id: 513 - type: TREE - name: tertiary - ... 
- -- Make a partial search - tarantool> box.space.tester.index.tertiary:select({'Scorpions', 2015}) - --- - - - [2, 'Scorpions', 2015, 4] - ... - -4. The search may be for all fields, using a table for the value: - - .. code-block:: tarantoolsession - - tarantool> box.space.tester.index.tertiary:select({'Roxette', 2016, 3}) - --- - - - [4, 'Roxette', 2016, 3] - ... - - or the search can be for one field, using a table or a scalar: - - .. code-block:: tarantoolsession - - tarantool> box.space.tester.index.tertiary:select({'Roxette'}) - --- - - - [1, 'Roxette', 1986, 5] - - [4, 'Roxette', 2016, 3] - ... - -.. admonition:: Tip - :class: fact - - You can add, drop, or alter the definitions at runtime, with some restrictions. - Read more about index operations in reference for - :doc:`box.index submodule `. From 4d0dd9d39c473cbed0aa003924a63afd376aef35 Mon Sep 17 00:00:00 2001 From: Natalia Ogoreltseva Date: Wed, 10 Mar 2021 15:22:42 +0300 Subject: [PATCH 3/9] revise subsections in create_index.rst --- doc/book/box/indexes.rst | 3 + .../reference_lua/box_space/create_index.rst | 198 ++++++++++-------- 2 files changed, 112 insertions(+), 89 deletions(-) diff --git a/doc/book/box/indexes.rst b/doc/book/box/indexes.rst index 4c0274219d..b8ba3b1d19 100644 --- a/doc/book/box/indexes.rst +++ b/doc/book/box/indexes.rst @@ -28,6 +28,9 @@ Indexes have certain limitations. See details on page Creating an index -------------------------------------------------------------------------------- +It is mandatory to create an index for a space before trying to insert +tuples into it, or select tuples from it. 
+ The simple :doc:`index-creation ` operation is: diff --git a/doc/reference/reference_lua/box_space/create_index.rst b/doc/reference/reference_lua/box_space/create_index.rst index 6cb9a42871..ab01ffeb3b 100644 --- a/doc/reference/reference_lua/box_space/create_index.rst +++ b/doc/reference/reference_lua/box_space/create_index.rst @@ -52,7 +52,7 @@ On this page: **Options for space_object:create_index()** - .. container:: table + .. container:: table .. rst-class:: left-align-column-1 .. rst-class:: left-align-column-2 @@ -129,10 +129,10 @@ On this page: .. code-block:: tarantoolsession - tarantool> s=box.schema.space.create('tester') + tarantool> my_space = box.schema.space.create('tester') --- ... - tarantool> s:create_index('primary', {unique = true, parts = { + tarantool> my_space:create_index('primary', {unique = true, parts = { > {field = 1, type = 'unsigned'}, > {field = 2, type = 'string'} > }}) @@ -305,8 +305,10 @@ Allowing null for an indexed key If the index type is TREE, and the index is not the primary index, then the ``parts={...}`` clause may include ``is_nullable=true`` or -``is_nullable=false`` (the default). If ``is_nullable`` is true, -then it is legal to insert ``nil`` or an equivalent such as ``msgpack.NULL``. +``is_nullable=false`` (the default). + +If ``is_nullable`` is true, then it is legal to insert ``nil`` or an equivalent +such as ``msgpack.NULL``. It is also legal to insert nothing at all when using trailing nullable fields. Within indexes, such "null values" are always treated as equal to other null values, and are always treated as less than non-null values. @@ -314,7 +316,7 @@ Nulls may appear multiple times even in a unique index. Example: .. code-block:: lua - box.space.tester:create_index('I',{unique=true,parts={{field = 2, type = 'number', is_nullable = true}}}) + box.space.tester:create_index('I', {unique = true, parts = {{field = 2, type = 'number', is_nullable = true}}}) .. 
WARNING:: @@ -331,9 +333,9 @@ Nulls may appear multiple times even in a unique index. Example: Creating an index using field names instead of field numbers -------------------------------------------------------------------------------- -``create_index()`` can use -field names and/or field types described by the optional +``create_index()`` can use field names and/or field types described by the optional :doc:`/reference/reference_lua/box_space/format` clause. + In the following example, we show ``format()`` for a space that has two columns named 'x' and 'y', and then we show five variations of the ``parts={}`` clause of ``create_index()``, @@ -342,17 +344,22 @@ The variations include omitting the type, using numbers, and adding extra braces .. code-block:: lua - box.space.tester:format({{name='x', type='scalar'}, {name='y', type='integer'}}) - box.space.tester:create_index('I2',{parts={{'x', 'scalar'}}}) - box.space.tester:create_index('I3',{parts={{'x','scalar'},{'y','integer'}}}) - box.space.tester:create_index('I4',{parts={{1,'scalar'}}}) - box.space.tester:create_index('I5',{parts={{1,'scalar'},{2,'integer'}}}) - box.space.tester:create_index('I6',{parts={1}}) - box.space.tester:create_index('I7',{parts={1,2}}) - box.space.tester:create_index('I8',{parts={'x'}}) - box.space.tester:create_index('I9',{parts={'x','y'}}) - box.space.tester:create_index('I10',{parts={{'x'}}}) - box.space.tester:create_index('I11',{parts={{'x'},{'y'}}}) + box.space.tester:format({{name = 'x', type = 'scalar'}, {name = 'y', type = 'integer'}}) + + box.space.tester:create_index('I2', {parts = {{'x', 'scalar'}}}) + box.space.tester:create_index('I3', {parts = {{'x', 'scalar'}, {'y', 'integer'}}}) + + box.space.tester:create_index('I4', {parts = {{1, 'scalar'}}}) + box.space.tester:create_index('I5', {parts = {{1, 'scalar'}, {2, 'integer'}}}) + + box.space.tester:create_index('I6', {parts = {1}}) + box.space.tester:create_index('I7', {parts = {1, 2}}) + + 
box.space.tester:create_index('I8', {parts = {'x'}}) + box.space.tester:create_index('I9', {parts = {'x', 'y'}}) + + box.space.tester:create_index('I10', {parts = {{'x'}}}) + box.space.tester:create_index('I11', {parts = {{'x'}, {'y'}}}) .. _box_space-path: @@ -361,32 +368,39 @@ Creating an index using the path option for map fields (JSON-path indexes) -------------------------------------------------------------------------------- To create an index for a field that is a map (a path string and a scalar value), -specify the path string during index_create, that is, -:code:`parts={` :samp:`{field-number},'{data-type}',path = '{path-name}'` :code:`}`. -The index type must be ``'tree'`` or ``'hash'`` and the field's contents +specify the path string during index creation, like this: + +.. cssclass:: highlight +.. parsed-literal:: + + :extsamp:`parts = {{*{field-number}*}, {*{'data-type'}*}, path = {*{'path-name'}*}}` + +The index type must be ``tree`` or ``hash`` and the contents of the field must always be maps with the same path. **Example 1 -- The simplest use of path:** -.. code-block:: lua +.. code-block:: tarantoolsession - -- Result will be - - [{'age': 44}] - box.schema.space.create('T') - box.space.T:create_index('I',{parts={{field = 1, type = 'scalar', path = 'age'}}}) - box.space.T:insert{{age=44}} - box.space.T:select(44) + tarantool> box.schema.space.create('T') + tarantool> box.space.T:create_index('I',{parts = {{field = 1, type = 'scalar', path = 'age'}}}) + tarantool> box.space.T:insert({{age = 44}}) + tarantool> box.space.T:select(44) + --- + - [{'age': 44}] **Example 2 -- path plus format() plus JSON syntax to add clarity:** .. 
code-block:: lua - -- Result will be: - [1, {'FIO': {'surname': 'Xi', 'firstname': 'Ahmed'}}] - s = box.schema.space.create('T') - format = {{'id', 'unsigned'}, {'data', 'map'}} - s:format(format) - parts = {{'data.FIO["firstname"]', 'str'}, {'data.FIO["surname"]', 'str'}} - i = s:create_index('info', {parts = parts}) - s:insert({1, {FIO={firstname='Ahmed', surname='Xi'}}}) + tarantool> my_space = box.schema.space.create('T') + tarantool> format = {{'id', 'unsigned'}, {'data', 'map'}} + tarantool> my_space:format(format) + tarantool> parts = {{'data.FIO["firstname"]', 'str'}, {'data.FIO["surname"]', 'str'}} + tarantool> my_index = my_space:create_index('info', {parts = parts}) + tarantool> my_space:insert({1, {FIO = {firstname = 'Ahmed', surname = 'Xi'}}}) + --- + - [1, {'FIO': {'surname': 'Xi', 'firstname': 'Ahmed'}}] **Note re storage engine:** vinyl supports only the TREE index type, and vinyl secondary indexes must be created before tuples are inserted. @@ -394,17 +408,21 @@ secondary indexes must be created before tuples are inserted. .. _box_space-path_multikey: -------------------------------------------------------------------------------- -Creating a multikey index using the path option with [*] +Creating a multikey index using the path option with wildcard [*] -------------------------------------------------------------------------------- -The string in a path option can contain '[*]' which is called -an array index placeholder. Indexes defined with this are useful +The string in a path option can contain ``[*]`` which is called +**an array index placeholder**. Indexes defined with this are useful for JSON documents that all have the same structure. For example, when creating an index on field#2 for a string document that will start with ``{'data': [{'name': '...'}, {'name': '...'}]``, -the parts section in the create_index request could look like: -``parts = {{field = 2, type = 'str', path = 'data[*].name'}}``. 
+the parts section in the ``create_index`` request could look like: + +.. code-block:: lua + + parts = {{field = 2, type = 'str', path = 'data[*].name'}} + Then tuples containing names can be retrieved quickly with ``index_object:select({key-value})``. @@ -414,20 +432,20 @@ which both match the request: .. code-block:: lua - s = box.schema.space.create('json_documents') - s:create_index('primarykey') - i = s:create_index('multikey', {parts = {{field = 2, type = 'str', path = 'data[*].name'}}}) - s:insert({1, - {data = {{name='A'}, - {name='B'}}, + my_space = box.schema.space.create('json_documents') + my_space:create_index('primary') + multikey_index = my_space:create_index('multikey', {parts = {{field = 2, type = 'str', path = 'data[*].name'}}}) + my_space:insert({1, + {data = {{name = 'A'}, + {name = 'B'}}, extra_field = 1}}) - i:select({''},{iterator='GE'}) + multikey_index:select({''}, {iterator = 'GE'}) The result of the select request looks like this: .. code-block:: tarantoolsession - tarantool> i:select({''},{iterator='GE'}) + tarantool> multikey_index:select({''},{iterator='GE'}) --- - - [1, {'data': [{'name': 'A'}, {'name': 'B'}], 'extra_field': 1}] - [1, {'data': [{'name': 'A'}, {'name': 'B'}], 'extra_field': 1}] @@ -435,16 +453,15 @@ The result of the select request looks like this: Some restrictions exist: -* '[*]' must be alone or must be at the end of a name in the path -* '[*]' must not appear twice in the path -* if an index has a path with x[*] then no other index can have a path with +* ``[*]`` must be alone or must be at the end of a name in the path +* ``[*]`` must not appear twice in the path +* if an index has a path with ``x[*]`` then no other index can have a path with x.component -* '[*]' must not appear in the path of a primary-key -* if an index has ``unique=true`` and has a path with '[*]' +* ``[*]`` must not appear in the path of a primary-key +* if an index has ``unique=true`` and has a path with ``[*]`` then duplicate keys from 
different tuples are disallowed but duplicate keys for the same tuple are allowed -* as with :ref:`Using the path option for map fields `, - the field's value must have the structure that the path definition implies, +* the field's value must have the same structure as in the path definition, or be nil (nil is not indexed) .. _box_space-index_func: @@ -458,30 +475,33 @@ the index key, rather than depending entirely on the Tarantool default formation Functional indexes are useful for condensing or truncating or reversing or any other way that users want to customize the index. -The function definition must expect a tuple (which has the contents of -fields at the time a data-change request happens) and must return a tuple -(which has the contents that will actually be put in the index). +There are several recommendations on building functional indexes: + +* The function definition must expect a tuple, which has the contents of + fields at the time a data-change request happens, and must return a tuple, + which has the contents that will actually be put in the index. -The space must have a memtx engine. +* The ``create_index`` definition must include specification of all key parts, + and the custom function must return a table which has the same number of key + parts with the same types. -The function must be :ref:`persistent ` -and deterministic. +* The space must have a memtx engine. -The key parts must not depend on JSON paths. +* The function must be persistent and deterministic + (see :ref:`Creating function with body`). -The ``create_index`` definition must include specification of all key parts, -and the function must return a table which has the same number of key parts -with the same types. +* The key parts must not depend on JSON paths. -The function must access key-part values by index, not by field name. +* The function must access key-part values by index, not by field name. -Functional indexes must not be primary-key indexes. 
+* Functional indexes must not be primary-key indexes. -Functional indexes cannot be altered and the function cannot be changed if -it is used for an index, so the only way to change them is to drop the index -and create it again. +* Functional indexes cannot be altered and the function cannot be changed if + it is used for an index, so the only way to change them is to drop the index + and create it again. -Only sandboxed functions are suitable for functional indexes. +* Only :ref:`sandboxed ` functions + are suitable for functional indexes. **Example:** @@ -492,8 +512,8 @@ A function could make a key using only the first letter of a string field. .. code-block:: lua - box.schema.space.create('x', {engine = 'memtx'}) - box.space.x:create_index('i',{parts={{field = 1, type = 'string'}}}) + box.schema.space.create('tester', {engine = 'memtx'}) + box.space.tester:create_index('i',{parts={{field = 1, type = 'string'}}}) #. Make a function. The function expects a tuple. In this example it will work on tuple[2] because the key source is field number 2 in what we will @@ -507,7 +527,7 @@ A function could make a key using only the first letter of a string field. .. code-block:: lua - box.schema.func.create('F', + box.schema.func.create('my_func', {body = lua_code, is_deterministic = true, is_sandboxed = true}) #. Make a functional index. Specify the fields whose values will be passed @@ -515,7 +535,7 @@ A function could make a key using only the first letter of a string field. .. code-block:: lua - box.space.x:create_index('j',{parts={{field = 1, type = 'string'}},func = 'F'}) + box.space.tester:create_index('func_idx',{parts={{field = 1, type = 'string'}},func = 'my_func'}) #. Test. Insert a few tuples. Select using only the first letter, it will work because that is the key. Or, select using the same function as was used for @@ -523,20 +543,20 @@ A function could make a key using only the first letter of a string field. .. 
code-block:: lua - box.space.x:insert{'a', 'wombat'} - box.space.x:insert{'b', 'rabbit'} - box.space.x.index.j:select('w') - box.space.x.index.j:select(box.func.F:call({{'x', 'wombat'}})); + box.space.tester:insert({'a', 'wombat'}) + box.space.tester:insert({'b', 'rabbit'}) + box.space.tester.index.func_idx:select('w') + box.space.tester.index.func_idx:select(box.func.my_func:call({{'tester', 'wombat'}})); The results of the two ``select`` requests will look like this: .. code-block:: tarantoolsession - tarantool> box.space.x.index.j:select('w') + tarantool> box.space.tester.index.func_idx:select('w') --- - - ['a', 'wombat'] ... - tarantool> box.space.x.index.j:select(box.func.F:call({{'x','wombat'}})); + tarantool> box.space.tester.index.func_idx:select(box.func.my_func:call({{'tester','wombat'}})); --- - - ['a', 'wombat'] ... @@ -545,16 +565,16 @@ Here is the full code of the example: .. code-block:: lua - box.schema.space.create('x', {engine = 'memtx'}) - box.space.x:create_index('i',{parts={{field = 1, type = 'string'}}}) + box.schema.space.create('tester', {engine = 'memtx'}) + box.space.tester:create_index('i',{parts={{field = 1, type = 'string'}}}) lua_code = [[function(tuple) return {string.sub(tuple[2],1,1)} end]] - box.schema.func.create('F', + box.schema.func.create('my_func', {body = lua_code, is_deterministic = true, is_sandboxed = true}) - box.space.x:create_index('j',{parts={{field = 1, type = 'string'}},func = 'F'}) - box.space.x:insert{'a', 'wombat'} - box.space.x:insert{'b', 'rabbit'} - box.space.x.index.j:select('w') - box.space.x.index.j:select(box.func.F:call({{'x', 'wombat'}})); + box.space.tester:create_index('func_idx',{parts={{field = 1, type = 'string'}},func = 'my_func'}) + box.space.tester:insert({'a', 'wombat'}) + box.space.tester:insert({'b', 'rabbit'}) + box.space.tester.index.func_idx:select('w') + box.space.tester.index.func_idx:select(box.func.my_func:call({{'tester', 'wombat'}})); Functions for functional indexes can return 
**multiple keys**. Such functions are called "multikey" functions. From 131f789ac6d0f22611596748346cf23cc119412b Mon Sep 17 00:00:00 2001 From: Natalia Ogoreltseva <37013254+Onvember@users.noreply.github.com> Date: Wed, 10 Mar 2021 15:31:12 +0300 Subject: [PATCH 4/9] Apply suggestions from code review Co-authored-by: Nick Volynkin --- doc/book/box/indexes.rst | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/doc/book/box/indexes.rst b/doc/book/box/indexes.rst index b8ba3b1d19..38b4b464bc 100644 --- a/doc/book/box/indexes.rst +++ b/doc/book/box/indexes.rst @@ -82,17 +82,17 @@ system spaces :ref:`_space ` and :ref:`_index Index operations -------------------------------------------------------------------------------- -Index operations are automatic: if a data-manipulation request changes a tuple, +Index operations are automatic: if a data manipulation request changes a tuple, then it also changes the index keys defined for the tuple. -#. For further demonstrations let's create a sample space named 'tester' and - put it in a variable 'my_space': +#. For further demonstrations let's create a sample space named ``tester`` and + put it in a variable ``my_space``: .. code-block:: tarantoolsession tarantool> my_space = box.schema.space.create('tester') -#. Then format the created space by specifying field names and types: +#. Format the created space by specifying field names and types: .. code-block:: tarantoolsession @@ -113,8 +113,7 @@ then it also changes the index keys defined for the tuple. This is a primary index based on the ``id`` field of each tuple. -#. And insert some :ref:`tuples ` (that are records in Tarantool) - into the space: +#. Insert some :ref:`tuples ` into the space: .. code-block:: tarantoolsession @@ -165,7 +164,7 @@ then it also changes the index keys defined for the tuple. **There are the following SELECT variations:** -#. 
The search can use comparisons other than equality: +* The search can use **comparisons** other than equality: .. code-block:: tarantoolsession @@ -176,11 +175,16 @@ then it also changes the index keys defined for the tuple. - [4, 'Roxette', 2016, 3] ... - The :ref:`comparison operators ` are LT, LE, EQ, - REQ, GE, GT (for "less than", "less than or equal", "equal", "reversed equal", - "greater than or equal", "greater than" respectively). + The :ref:`comparison operators ` are: + + * ``LT`` for "less than" + * ``LE`` for "less than or equal" + * ``GT`` for "greater" + * ``GE`` for "greater than or equal" . + * ``EQ`` for "equal", + * ``REQ`` for "reversed equal" + Comparisons make sense if and only if the index type is TREE. - Note that we didn't use the name of the index, which means we use primary index here. This type of search may return more than one tuple; if so, the tuples will be @@ -203,7 +207,7 @@ then it also changes the index keys defined for the tuple. .. _partial_key_search: #. **Partial key search:** The search may be for some key parts starting with - the prefix of the key. Notice that partial key searches are available + the prefix of the key. Note that partial key searches are available only in TREE indexes. .. code-block:: tarantoolsession @@ -374,7 +378,7 @@ RTREE index can accept two types of ``distance`` functions: ``euclid`` and ``man .. 
code-block:: lua my_space = box.schema.create_space("test") - my_space:format{ { type= 'number', name='id' }, { type='array', name='content' } } + my_space:format({ { type= 'number', name='id' }, { type='array', name='content' } }) hash_index = my_space:create_index('primary', { type = 'HASH', parts = {'id'} }) rtree_index = my_space:create_index('spatial', { type = 'RTREE', unique = false, parts = {'content'} }) From 37738c1969ccaebe7fe50e2afda9c81637bb932c Mon Sep 17 00:00:00 2001 From: Natalia Ogoreltseva Date: Thu, 11 Mar 2021 12:18:00 +0300 Subject: [PATCH 5/9] small improvements --- doc/book/box/indexes.rst | 119 ++++++++++++++++++++------------------- 1 file changed, 61 insertions(+), 58 deletions(-) diff --git a/doc/book/box/indexes.rst b/doc/book/box/indexes.rst index 38b4b464bc..9881626780 100644 --- a/doc/book/box/indexes.rst +++ b/doc/book/box/indexes.rst @@ -4,8 +4,7 @@ Indexes ================================================================================ An **index** is a special data structure that stores a group of key values and -pointers. It is used for efficient manipulations with data -and should be chosen depending on the task. +pointers. It is used for efficient manipulations with data. As with spaces, you should specify the index **name**, and let Tarantool come up with a unique **numeric identifier** ("index id"). @@ -143,7 +142,11 @@ then it also changes the index keys defined for the tuple. .. code-block:: tarantoolsession - tarantool> box.space.tester:create_index('thrine', {parts = {{field = 2, type = 'string'}, {field=3, type='unsigned'}, {field=4, type='unsigned'}}}) + tarantool> box.space.tester:create_index('thrine', {parts = { + > {field = 2, type = 'string'}, + > {field=3, type='unsigned'}, + > {field=4, type='unsigned'} + > }}) --- - unique: true parts: @@ -164,83 +167,83 @@ then it also changes the index keys defined for the tuple. 
**There are the following SELECT variations:** -* The search can use **comparisons** other than equality: +* The search can use **comparisons** other than equality: - .. code-block:: tarantoolsession + .. code-block:: tarantoolsession - tarantool> box.space.tester:select(1, {iterator = 'GT'}) - --- - - - [2, 'Scorpions', 2015, 4] - - [3, 'Ace of Base', 1993] - - [4, 'Roxette', 2016, 3] - ... + tarantool> box.space.tester:select(1, {iterator = 'GT'}) + --- + - - [2, 'Scorpions', 2015, 4] + - [3, 'Ace of Base', 1993] + - [4, 'Roxette', 2016, 3] + ... - The :ref:`comparison operators ` are: - - * ``LT`` for "less than" - * ``LE`` for "less than or equal" - * ``GT`` for "greater" - * ``GE`` for "greater than or equal" . - * ``EQ`` for "equal", - * ``REQ`` for "reversed equal" - - Comparisons make sense if and only if the index type is TREE. - Note that we didn't use the name of the index, which means we use primary index here. + The :ref:`comparison operators ` are: - This type of search may return more than one tuple; if so, the tuples will be - in descending order by key when the comparison operator is LT or LE or REQ, - otherwise in ascending order. + * ``LT`` for "less than" + * ``LE`` for "less than or equal" + * ``GT`` for "greater" + * ``GE`` for "greater than or equal" . + * ``EQ`` for "equal", + * ``REQ`` for "reversed equal" -#. The search can use a **secondary index**. + Comparisons make sense if and only if the index type is TREE. + Note that we didn't use the name of the index, which means we use primary index here. - For a primary-key search, it is optional to specify an index name as - was demonstrated above. - For a secondary-key search, it is mandatory. + This type of search may return more than one tuple; if so, the tuples will be + in descending order by key when the comparison operator is LT or LE or REQ, + otherwise in ascending order. - .. code-block:: tarantoolsession +* The search can use a **secondary index**. 
- tarantool> box.space.tester.index.secondary:select({1993}) - --- - - - [3, 'Ace of Base', 1993] - ... + For a primary-key search, it is optional to specify an index name as + was demonstrated above. + For a secondary-key search, it is mandatory. - .. _partial_key_search: + .. code-block:: tarantoolsession -#. **Partial key search:** The search may be for some key parts starting with - the prefix of the key. Note that partial key searches are available - only in TREE indexes. + tarantool> box.space.tester.index.secondary:select({1993}) + --- + - - [3, 'Ace of Base', 1993] + ... - .. code-block:: tarantoolsession + .. _partial_key_search: - tarantool> box.space.tester.index.thrine:select({'Scorpions', 2015}) - --- - - - [2, 'Scorpions', 2015, 4] - ... +* **Partial key search:** The search may be for some key parts starting with + the prefix of the key. Note that partial key searches are available + only in TREE indexes. -#. The search can be for all fields, using a table as the value: + .. code-block:: tarantoolsession - .. code-block:: tarantoolsession + tarantool> box.space.tester.index.thrine:select({'Scorpions', 2015}) + --- + - - [2, 'Scorpions', 2015, 4] + ... - tarantool> box.space.tester.index.thrine:select({'Roxette', 2016, 3}) - --- - - - [4, 'Roxette', 2016, 3] - ... +* The search can be for all fields, using a table as the value: - or the search can be for one field, using a table or a scalar: + .. code-block:: tarantoolsession - .. code-block:: tarantoolsession + tarantool> box.space.tester.index.thrine:select({'Roxette', 2016, 3}) + --- + - - [4, 'Roxette', 2016, 3] + ... - tarantool> box.space.tester.index.thrine:select({'Roxette'}) - --- - - - [1, 'Roxette', 1986, 5] - - [4, 'Roxette', 2016, 3] - ... + or the search can be for one field, using a table or a scalar: + + .. code-block:: tarantoolsession + + tarantool> box.space.tester.index.thrine:select({'Roxette'}) + --- + - - [1, 'Roxette', 1986, 5] + - [4, 'Roxette', 2016, 3] + ... .. 
admonition:: Tip :class: fact - You can also add, drop, or alter the definitions at runtime, with some restrictions. - Read more about index operations in reference for + You can also add, drop, or alter the definitions at runtime, with some + restrictions. Read more about index operations in reference for :doc:`box.index submodule `. -------------------------------------------------------------------------------- From 7c778d77efa38d03f3af65f0b437750accb7914c Mon Sep 17 00:00:00 2001 From: Natalia Ogoreltseva Date: Fri, 12 Mar 2021 10:44:47 +0300 Subject: [PATCH 6/9] more improvements --- doc/book/box/indexes.rst | 21 +++++++++++-------- .../reference_lua/box_space/create_index.rst | 2 +- 2 files changed, 13 insertions(+), 10 deletions(-) diff --git a/doc/book/box/indexes.rst b/doc/book/box/indexes.rst index 9881626780..a26a35a792 100644 --- a/doc/book/box/indexes.rst +++ b/doc/book/box/indexes.rst @@ -62,7 +62,7 @@ and their expected **types**. See allowed indexed field types in section .. cssclass:: highlight .. parsed-literal:: - :extsamp:`box.space.{**{space-name}**}:create_index('primary', {type = 'hash', parts = {{field = 1, type = 'unsigned'}}}` + :extsamp:`box.space.{**{space-name}**}:create_index({**{index-name}**}, {type = 'tree', parts = {{field = 1, type = 'unsigned'}}})` Space definitions and index definitions are stored permanently in Tarantool's system spaces :ref:`_space ` and :ref:`_index `. @@ -106,7 +106,7 @@ then it also changes the index keys defined for the tuple. .. code-block:: tarantoolsession tarantool> my_space:create_index('primary', { - > type = 'hash', + > type = 'tree', > parts = {'id'} > }) @@ -144,8 +144,8 @@ then it also changes the index keys defined for the tuple. 
tarantool> box.space.tester:create_index('thrine', {parts = { > {field = 2, type = 'string'}, - > {field=3, type='unsigned'}, - > {field=4, type='unsigned'} + > {field = 3, type = 'unsigned'}, + > {field = 4, type = 'unsigned'} > }}) --- - unique: true @@ -187,7 +187,10 @@ then it also changes the index keys defined for the tuple. * ``EQ`` for "equal", * ``REQ`` for "reversed equal" - Comparisons make sense if and only if the index type is TREE. + Value comparisons make sense if and only if the index type is TREE. + The iterator types for other types of indexes are slightly different and work + differently. See details in section :ref:`Iterator types `. + Note that we didn't use the name of the index, which means we use primary index here. This type of search may return more than one tuple; if so, the tuples will be @@ -509,9 +512,9 @@ Here's short example of using 4D tree: .. code-block:: tarantoolsession - tarantool> my_space = box.schema.create_space("test") - tarantool> my_space:format{ { type= 'number', name='id' }, { type='array', name='content' } } - tarantool> hash_index = my_space:create_index('primary', { type = 'HASH', parts = {'id'} }) + tarantool> my_space = box.schema.create_space("tester") + tarantool> my_space:format{ { type = 'number', name = 'id' }, { type = 'array', name = 'content' } } + tarantool> primary_index = my_space:create_index('primary', { type = 'TREE', parts = {'id'} }) tarantool> rtree_index = my_space:create_index('spatial', { type = 'RTREE', unique = false, dimension = 4, parts = {'content'} }) tarantool> my_space:insert{1, {1, 2, 3, 4}} -- insert 4D point tarantool> my_space:insert{2, {1, 1, 1, 1, 2, 2, 2, 2}} -- insert 4D box @@ -539,7 +542,7 @@ Here's short example of using 4D tree: that could be tons of data with corresponding performance. And another frequent mistake is to specify iterator type without quotes, - in such way: ``rtree_index:select(rect, {iterator = LE})``. 
+ in such way: ``rtree_index:select(rect, {iterator = LE})``. This leads to silent EQ select, because ``LE`` is undefined variable and treated as nil, so iterator is unset and default used. diff --git a/doc/reference/reference_lua/box_space/create_index.rst b/doc/reference/reference_lua/box_space/create_index.rst index ab01ffeb3b..e608d390bd 100644 --- a/doc/reference/reference_lua/box_space/create_index.rst +++ b/doc/reference/reference_lua/box_space/create_index.rst @@ -11,7 +11,7 @@ On this page: * :ref:`Creating an index using field names instead of field numbers ` * :ref:`Creating an index using the path option for map fields (JSON-path indexes) ` * :ref:`Creating an index using the path option with [*] ` -* :ref:`Creating a functional index with space_object:create_index() ` +* :ref:`Creating a functional index ` .. class:: space_object From 518e8f38c78a922203b5ebebac2cca38dc7e735a Mon Sep 17 00:00:00 2001 From: Natalia Ogoreltseva <37013254+Onvember@users.noreply.github.com> Date: Fri, 12 Mar 2021 17:12:40 +0300 Subject: [PATCH 7/9] Apply suggestions from code review Co-authored-by: Nick Volynkin --- doc/book/box/indexes.rst | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/doc/book/box/indexes.rst b/doc/book/box/indexes.rst index a26a35a792..377f2a0a7f 100644 --- a/doc/book/box/indexes.rst +++ b/doc/book/box/indexes.rst @@ -28,7 +28,7 @@ Creating an index -------------------------------------------------------------------------------- It is mandatory to create an index for a space before trying to insert -tuples into it, or select tuples from it. +tuples into the space, or select tuples from the space. The simple :doc:`index-creation ` operation is: @@ -191,11 +191,11 @@ then it also changes the index keys defined for the tuple. The iterator types for other types of indexes are slightly different and work differently. See details in section :ref:`Iterator types `. 
- Note that we didn't use the name of the index, which means we use primary index here. + Note that we don't use the name of the index, which means we use primary index here. - This type of search may return more than one tuple; if so, the tuples will be - in descending order by key when the comparison operator is LT or LE or REQ, - otherwise in ascending order. + This type of search may return more than one tuple. The tuples will be sorted + in descending order by key if the comparison operator is LT or LE or REQ. + Otherwise they will be sorted in ascending order. * The search can use a **secondary index**. From 7248bda704575bcb1c58d6c7574d7ab6b7c31ce0 Mon Sep 17 00:00:00 2001 From: Natalia Ogoreltseva Date: Fri, 12 Mar 2021 18:40:52 +0300 Subject: [PATCH 8/9] add changes after review --- doc/book/box/indexes.rst | 148 +++++++++--------- .../reference_lua/box_space/create_index.rst | 60 +++---- 2 files changed, 105 insertions(+), 103 deletions(-) diff --git a/doc/book/box/indexes.rst b/doc/book/box/indexes.rst index 377f2a0a7f..739adf0142 100644 --- a/doc/book/box/indexes.rst +++ b/doc/book/box/indexes.rst @@ -42,7 +42,8 @@ This creates a unique :ref:`TREE ` index on the first field of all tuples (often called "Field#1"), which is assumed to be numeric. A recommended design pattern for a data model is to base primary keys on the -first fields of a tuple, because this speeds up tuple comparison. +first fields of a tuple. This speeds up tuple comparison due to the specifics of +data storage and the way comparisons are arranged in Tarantool. The simple :doc:`SELECT ` request is: @@ -54,6 +55,7 @@ The simple :doc:`SELECT ` request is: This looks for a single tuple via the first index. Since the first index is always unique, the maximum number of returned tuples will be 1. You can call ``select()`` without arguments, and it will return all tuples. +Be careful! Calling ``select()`` without arguments on a huge space can hang your instance. 
An index definition may also include identifiers of tuple fields and their expected **types**. See allowed indexed field types in section @@ -167,80 +169,80 @@ then it also changes the index keys defined for the tuple. **There are the following SELECT variations:** -* The search can use **comparisons** other than equality: +* The search can use **comparisons** other than equality: - .. code-block:: tarantoolsession + .. code-block:: tarantoolsession - tarantool> box.space.tester:select(1, {iterator = 'GT'}) - --- - - - [2, 'Scorpions', 2015, 4] - - [3, 'Ace of Base', 1993] - - [4, 'Roxette', 2016, 3] - ... + tarantool> box.space.tester:select(1, {iterator = 'GT'}) + --- + - - [2, 'Scorpions', 2015, 4] + - [3, 'Ace of Base', 1993] + - [4, 'Roxette', 2016, 3] + ... - The :ref:`comparison operators ` are: + The :ref:`comparison operators ` are: - * ``LT`` for "less than" - * ``LE`` for "less than or equal" - * ``GT`` for "greater" - * ``GE`` for "greater than or equal" . - * ``EQ`` for "equal", - * ``REQ`` for "reversed equal" + * ``LT`` for "less than" + * ``LE`` for "less than or equal" + * ``GT`` for "greater" + * ``GE`` for "greater than or equal" . + * ``EQ`` for "equal", + * ``REQ`` for "reversed equal" - Value comparisons make sense if and only if the index type is TREE. - The iterator types for other types of indexes are slightly different and work - differently. See details in section :ref:`Iterator types `. + Value comparisons make sense if and only if the index type is TREE. + The iterator types for other types of indexes are slightly different and work + differently. See details in section :ref:`Iterator types `. - Note that we don't use the name of the index, which means we use primary index here. + Note that we don't use the name of the index, which means we use primary index here. - This type of search may return more than one tuple. The tuples will be sorted - in descending order by key if the comparison operator is LT or LE or REQ. 
- Otherwise they will be sorted in ascending order. + This type of search may return more than one tuple. The tuples will be sorted + in descending order by key if the comparison operator is LT or LE or REQ. + Otherwise they will be sorted in ascending order. -* The search can use a **secondary index**. +* The search can use a **secondary index**. - For a primary-key search, it is optional to specify an index name as - was demonstrated above. - For a secondary-key search, it is mandatory. + For a primary-key search, it is optional to specify an index name as + was demonstrated above. + For a secondary-key search, it is mandatory. - .. code-block:: tarantoolsession + .. code-block:: tarantoolsession - tarantool> box.space.tester.index.secondary:select({1993}) - --- - - - [3, 'Ace of Base', 1993] - ... + tarantool> box.space.tester.index.secondary:select({1993}) + --- + - - [3, 'Ace of Base', 1993] + ... - .. _partial_key_search: + .. _partial_key_search: -* **Partial key search:** The search may be for some key parts starting with - the prefix of the key. Note that partial key searches are available - only in TREE indexes. +* **Partial key search:** The search may be for some key parts starting with + the prefix of the key. Note that partial key searches are available + only in TREE indexes. - .. code-block:: tarantoolsession + .. code-block:: tarantoolsession - tarantool> box.space.tester.index.thrine:select({'Scorpions', 2015}) - --- - - - [2, 'Scorpions', 2015, 4] - ... + tarantool> box.space.tester.index.thrine:select({'Scorpions', 2015}) + --- + - - [2, 'Scorpions', 2015, 4] + ... -* The search can be for all fields, using a table as the value: +* The search can be for all fields, using a table as the value: - .. code-block:: tarantoolsession + .. code-block:: tarantoolsession - tarantool> box.space.tester.index.thrine:select({'Roxette', 2016, 3}) - --- - - - [4, 'Roxette', 2016, 3] - ... 
+ tarantool> box.space.tester.index.thrine:select({'Roxette', 2016, 3}) + --- + - - [4, 'Roxette', 2016, 3] + ... - or the search can be for one field, using a table or a scalar: + or the search can be for one field, using a table or a scalar: - .. code-block:: tarantoolsession + .. code-block:: tarantoolsession - tarantool> box.space.tester.index.thrine:select({'Roxette'}) - --- - - - [1, 'Roxette', 1986, 5] - - [4, 'Roxette', 2016, 3] - ... + tarantool> box.space.tester.index.thrine:select({'Roxette'}) + --- + - - [1, 'Roxette', 1986, 5] + - [4, 'Roxette', 2016, 3] + ... .. admonition:: Tip :class: fact @@ -349,19 +351,19 @@ HASH is now present in Tarantool mainly because of backward compatibility. Here are some tips. Do not use HASH index: -* just if you want to -* if you think that HASH is faster with no performance metering -* if you want to iterate over the data -* for primary key -* as an only index +* just if you want to +* if you think that HASH is faster with no performance metering +* if you want to iterate over the data +* for primary key +* as an only index Use HASH index: -* if it is a secondary key -* if you 100% won't need to make it non-unique -* if you have taken measurements on your data and you see an accountable - increase in performance -* if you save every byte on tuples (HASH is a little more compact) +* if it is a secondary key +* if you 100% won't need to make it non-unique +* if you have taken measurements on your data and you see an accountable + increase in performance +* if you save every byte on tuples (HASH is a little more compact) .. _indexes-rtree: @@ -383,9 +385,9 @@ RTREE index can accept two types of ``distance`` functions: ``euclid`` and ``man .. 
code-block:: lua - my_space = box.schema.create_space("test") - my_space:format({ { type= 'number', name='id' }, { type='array', name='content' } }) - hash_index = my_space:create_index('primary', { type = 'HASH', parts = {'id'} }) + my_space = box.schema.create_space("tester") + my_space:format({ { type = 'number', name = 'id' }, { type = 'array', name = 'content' } }) + primary_index = my_space:create_index('primary', { type = 'tree', parts = {'id'} }) rtree_index = my_space:create_index('spatial', { type = 'RTREE', unique = false, parts = {'content'} }) Corresponding tuple field thus must be an array of 2 or 4 numbers. @@ -450,7 +452,7 @@ with their rectangles strictly within a specified rectangle: .. code-block:: tarantoolsession - tarantool> rtree_index:select({0, 0, 3, 3}, {iterator='lt'}) + tarantool> rtree_index:select({0, 0, 3, 3}, {iterator = 'lt'}) --- - - [1, [1, 1]] ... @@ -459,7 +461,7 @@ Iterator GE searches for tuples with a specified rectangle within their rectangl .. code-block:: tarantoolsession - tarantool> rtree_index:select({1, 1}, {iterator='ge'}) + tarantool> rtree_index:select({1, 1}, {iterator = 'ge'}) --- - - [1, [1, 1]] ... @@ -468,7 +470,7 @@ Iterator GT searches for tuples with a specified rectangle strictly within their .. code-block:: tarantoolsession - tarantool> rtree_index:select({2.1, 2.1, 2.9, 2.9}, {itearator='gt'}) + tarantool> rtree_index:select({2.1, 2.1, 2.9, 2.9}, {itearator = 'gt'}) --- - [] ... @@ -495,7 +497,7 @@ Iterator NEIGHBOR searches for all tuples and orders them by distance to the spe --- ... - tarantool> rtree_index:select({1, 1}, {iterator='neighbor', limit=5}) + tarantool> rtree_index:select({1, 1}, {iterator = 'neighbor', limit = 5}) --- - - [11, [1, 1, 2, 2]] - [12, [1, 2, 2, 3]] @@ -611,12 +613,12 @@ and bit values are entered as hexadecimal literals for easier reading. 
tarantool> box.schema.space.create('bitset_example') tarantool> box.space.bitset_example:create_index('primary') - tarantool> box.space.bitset_example:create_index('bitset',{unique=false,type='BITSET', parts={2,'unsigned'}}) + tarantool> box.space.bitset_example:create_index('bitset',{unique = false, type = 'BITSET', parts = {2,'unsigned'}}) tarantool> box.space.bitset_example:insert{1,1} tarantool> box.space.bitset_example:insert{2,4} tarantool> box.space.bitset_example:insert{3,7} tarantool> box.space.bitset_example:insert{4,3} - tarantool> box.space.bitset_example.index.bitset:select(2, {iterator='BITS_ANY_SET'}) + tarantool> box.space.bitset_example.index.bitset:select(2, {iterator = 'BITS_ANY_SET'}) The result will be: diff --git a/doc/reference/reference_lua/box_space/create_index.rst b/doc/reference/reference_lua/box_space/create_index.rst index e608d390bd..96f9c3bba0 100644 --- a/doc/reference/reference_lua/box_space/create_index.rst +++ b/doc/reference/reference_lua/box_space/create_index.rst @@ -310,7 +310,7 @@ then the ``parts={...}`` clause may include ``is_nullable=true`` or If ``is_nullable`` is true, then it is legal to insert ``nil`` or an equivalent such as ``msgpack.NULL``. It is also legal to insert nothing at all when using trailing nullable fields. -Within indexes, such "null values" are always treated as equal to other null +Within indexes, such null values are always treated as equal to other null values, and are always treated as less than non-null values. Nulls may appear multiple times even in a unique index. Example: @@ -375,7 +375,7 @@ specify the path string during index creation, like this: :extsamp:`parts = {{*{field-number}*}, {*{'data-type'}*}, path = {*{'path-name'}*}}` -The index type must be ``tree`` or ``hash`` and the contents of the field +The index type must be TREE or HASH and the contents of the field must always be maps with the same path. 
**Example 1 -- The simplest use of path:** @@ -408,7 +408,7 @@ secondary indexes must be created before tuples are inserted. .. _box_space-path_multikey: -------------------------------------------------------------------------------- -Creating a multikey index using the path option with wildcard [*] +Creating a multikey index using the path option with [*] -------------------------------------------------------------------------------- The string in a path option can contain ``[*]`` which is called @@ -453,16 +453,16 @@ The result of the select request looks like this: Some restrictions exist: -* ``[*]`` must be alone or must be at the end of a name in the path -* ``[*]`` must not appear twice in the path -* if an index has a path with ``x[*]`` then no other index can have a path with - x.component -* ``[*]`` must not appear in the path of a primary-key -* if an index has ``unique=true`` and has a path with ``[*]`` - then duplicate keys from different tuples are disallowed but duplicate keys - for the same tuple are allowed -* the field's value must have the same structure as in the path definition, - or be nil (nil is not indexed) +* ``[*]`` must be alone or must be at the end of a name in the path +* ``[*]`` must not appear twice in the path +* if an index has a path with ``x[*]`` then no other index can have a path with + x.component +* ``[*]`` must not appear in the path of a primary-key +* if an index has ``unique=true`` and has a path with ``[*]`` + then duplicate keys from different tuples are disallowed but duplicate keys + for the same tuple are allowed +* the field's value must have the same structure as in the path definition, + or be nil (nil is not indexed) .. _box_space-index_func: @@ -477,31 +477,31 @@ any other way that users want to customize the index. 
There are several recommendations on building functional indexes: -* The function definition must expect a tuple, which has the contents of - fields at the time a data-change request happens, and must return a tuple, - which has the contents that will actually be put in the index. +* The function definition must expect a tuple, which has the contents of + fields at the time a data-change request happens, and must return a tuple, + which has the contents that will actually be put in the index. -* The ``create_index`` definition must include specification of all key parts, - and the custom function must return a table which has the same number of key - parts with the same types. +* The ``create_index`` definition must include specification of all key parts, + and the custom function must return a table which has the same number of key + parts with the same types. -* The space must have a memtx engine. +* The space must have a memtx engine. -* The function must be persistent and deterministic - (see :ref:`Creating function with body`). +* The function must be persistent and deterministic + (see :ref:`Creating function with body`). -* The key parts must not depend on JSON paths. +* The key parts must not depend on JSON paths. -* The function must access key-part values by index, not by field name. +* The function must access key-part values by index, not by field name. -* Functional indexes must not be primary-key indexes. +* Functional indexes must not be primary-key indexes. -* Functional indexes cannot be altered and the function cannot be changed if - it is used for an index, so the only way to change them is to drop the index - and create it again. +* Functional indexes cannot be altered and the function cannot be changed if + it is used for an index, so the only way to change them is to drop the index + and create it again. -* Only :ref:`sandboxed ` functions - are suitable for functional indexes. 
+* Only :ref:`sandboxed ` functions + are suitable for functional indexes. **Example:** From de08ca0b22e502c0f93b642650dfa87aa626782d Mon Sep 17 00:00:00 2001 From: Natalia Ogoreltseva Date: Mon, 15 Mar 2021 12:20:40 +0300 Subject: [PATCH 9/9] more fixes --- doc/book/box/indexes.rst | 4 ++-- doc/reference/reference_lua/box_space/create_index.rst | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/book/box/indexes.rst b/doc/book/box/indexes.rst index 739adf0142..f636f3bc9e 100644 --- a/doc/book/box/indexes.rst +++ b/doc/book/box/indexes.rst @@ -73,7 +73,7 @@ system spaces :ref:`_space ` and :ref:`_index :class: fact See full information about creating indexes, such as - how to create an index using the ``path`` option, or + how to create a multikey index, an index using the ``path`` option, or how to create a functional index in our reference for :doc:`/reference/reference_lua/box_space/create_index`. @@ -101,7 +101,7 @@ then it also changes the index keys defined for the tuple. > {name = 'id', type = 'unsigned'}, > {name = 'band_name', type = 'string'}, > {name = 'year', type = 'unsigned'}, - > {name = 'rate', type = 'unsigned', is_nullable=true}}) + > {name = 'rate', type = 'unsigned', is_nullable = true}}) #. Create the **primary** index (named ``primary``): diff --git a/doc/reference/reference_lua/box_space/create_index.rst b/doc/reference/reference_lua/box_space/create_index.rst index 96f9c3bba0..251c3396e7 100644 --- a/doc/reference/reference_lua/box_space/create_index.rst +++ b/doc/reference/reference_lua/box_space/create_index.rst @@ -10,7 +10,7 @@ On this page: * :ref:`Allowing null for an indexed key ` * :ref:`Creating an index using field names instead of field numbers ` * :ref:`Creating an index using the path option for map fields (JSON-path indexes) ` -* :ref:`Creating an index using the path option with [*] ` +* :ref:`Creating a multikey index using the path option with [*] ` * :ref:`Creating a functional index ` .. 
class:: space_object