[GMG-Devel] Help extract documentation from my brain

Christopher Allan Webber cwebber at dustycloud.org
Sun Jul 17 16:37:07 EDT 2011


More database stuff!  This time it's indexing.

                               Indexing
                               ========

Some of the following is extracted straight from
mediagoblin/db/indexes.py.

1 Running latest updates / deprecation of indexes 
--------------------------------------------------

./bin/gmg migrate

Yes, this is the same command used to run database migrations.

2 For developers 
-----------------

2.1 Overview 
=============

Quick summary on indexes generally:
 - Basically, indexes make querying fast.  MongoDB doesn't auto-create
   indexes, though; we have to specify them.
 - Core things we're working on require indexes.  Querying on multiple
   keys at once requires a multi-key (compound) index, since MongoDB
   currently lacks an algorithm to combine multiple single-key indexes.
 - The ordering of keys in multi-key indexes matters.
 - When adding new queries, new fields, etc., consider discussing
   whether or not an index is appropriate!  New indexes carry a
   performance and memory penalty, but not using indexes means slow
   queries.
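
To make the point about key ordering concrete, here is a rough sketch
of the "prefix rule" for compound indexes: a query can be served
entirely by a compound index when the keys it matches on form a
leading prefix of the index's keys.  The function name and the example
keys ('uploader', 'created') are made up for illustration, and the
model deliberately ignores sorting and range queries.

```python
def is_index_prefix(index_keys, query_keys):
    """Return True if query_keys are exactly a leading prefix of index_keys.

    Simplified model: equality matches only, no sorting or ranges.
    """
    query_keys = set(query_keys)
    if not query_keys:
        return False
    prefix = index_keys[:len(query_keys)]
    return set(prefix) == query_keys


# Hypothetical compound index on (uploader, created), in that order.
index = ['uploader', 'created']

print(is_index_prefix(index, ['uploader']))             # True
print(is_index_prefix(index, ['uploader', 'created']))  # True
print(is_index_prefix(index, ['created']))              # False
```

So an index on (uploader, created) helps a query on uploader alone,
but not a query on created alone; that would need its own index or a
different key order.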

If you're touching indexes, you should read:
 - [http://kylebanker.com/blog/2010/09/21/the-joy-of-mongodb-indexes/]
 - [http://www.mongodb.org/display/DOCS/Indexes]
 - [http://www.mongodb.org/display/DOCS/Indexing+Advice+and+FAQ]


2.2 To add new indexes 
=======================

Indexes are recorded in the following format:

ACTIVE_INDEXES = {
    'collection_name': {
        # key identifier, used for possibly deprecating later
        'identifier': {
            'index': [index_foo_goes_here]}}}

... with any other keys in the entry being passed as parameters to the
create_index function (including unique=True, etc).

Current indexes must be registered in ACTIVE_INDEXES; deprecated
indexes should be marked in DEPRECATED_INDEXES.

Remember, ordering of compound indexes MATTERS.
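
As a rough illustration of the format, a filled-in registry might look
like the sketch below.  The collection name, identifiers, and field
names are hypothetical, not copied from mediagoblin/db/indexes.py, and
the create_index_args helper is invented here just to show how the
'index' key could be separated from the remaining create_index
parameters.  ASCENDING and DESCENDING are defined locally with the
same values pymongo uses (1 and -1) so the snippet stands alone.

```python
ASCENDING = 1    # same value as pymongo.ASCENDING
DESCENDING = -1  # same value as pymongo.DESCENDING

ACTIVE_INDEXES = {
    'media_entries': {                      # hypothetical collection
        'uploader_slug_unique': {           # hypothetical identifier
            'index': [('uploader', ASCENDING),
                      ('slug', ASCENDING)],
            'unique': True},                # extra create_index parameter
        'uploader_created': {               # hypothetical identifier
            'index': [('uploader', ASCENDING),
                      ('created', DESCENDING)]}}}


def create_index_args(spec):
    """Split a registry entry into (keys, extra keyword arguments)."""
    kwargs = dict(spec)          # copy so the registry is not mutated
    keys = kwargs.pop('index')
    return keys, kwargs


keys, kwargs = create_index_args(
    ACTIVE_INDEXES['media_entries']['uploader_slug_unique'])
print(keys)
print(kwargs)
```

Note that the two entries share 'uploader' as their leading key but
differ in the second key, so each serves a different kind of query.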


2.3 To remove deprecated indexes 
=================================

Removing an index works the same way: just move its entry into the
deprecated indexes mapping.

DEPRECATED_INDEXES = {
    'collection_name': {
        'deprecated_index_identifier1': {
            'index': [index_foo_goes_here]}}}

... etc.

If an index has been deprecated, its identifier should NEVER BE USED
AGAIN.  E.g., if you previously had 'awesomepants_unique', you
shouldn't use 'awesomepants_unique' again; you should create a totally
new name or at worst use 'awesomepants_unique2'.

The reason is that the index name is how we track whether or not the
index is installed.  Reusing a name makes it impossible to tell the
old index apart from the new one.  So just use a new name!
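
To see why name-based tracking rules out reusing identifiers, here is
a minimal sketch of how a migration step could decide what to create
and what to drop.  This is not the actual MediaGoblin migration code;
plan_index_changes and the 'things' collection are hypothetical, and
the set of installed (collection, identifier) pairs is assumed to be
recorded somewhere by the migration tool.

```python
def plan_index_changes(installed, active, deprecated):
    """Work out which identifiers to create and which to drop.

    installed:  set of (collection, identifier) pairs already recorded
    active:     registry in the ACTIVE_INDEXES format
    deprecated: registry in the DEPRECATED_INDEXES format
    """
    to_create = [
        (collection, identifier)
        for collection, indexes in sorted(active.items())
        for identifier in sorted(indexes)
        if (collection, identifier) not in installed]
    to_drop = [
        (collection, identifier)
        for collection, indexes in sorted(deprecated.items())
        for identifier in sorted(indexes)
        if (collection, identifier) in installed]
    return to_create, to_drop


installed = {('things', 'awesomepants_unique')}
deprecated = {'things': {
    'awesomepants_unique': {'index': [('awesomepants', 1)]}}}
active = {'things': {
    'awesomepants_unique2': {'index': [('awesomepants', 1)],
                             'unique': True}}}

to_create, to_drop = plan_index_changes(installed, active, deprecated)
print(to_create)  # [('things', 'awesomepants_unique2')]
print(to_drop)    # [('things', 'awesomepants_unique')]
```

Had the new index reused the name 'awesomepants_unique', this logic
would see it as already installed and skip creating it, while also
scheduling the same name for removal.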

