unique integer IDs in Google datastore

update: A good discussion on the topics mentioned in this article can be found here, please read it before using the code :-)

newbie code ahead! Use at your own risk :-)

One of the first problems I faced when trying to build an application in Google AppEngine was the lack of something like a “unique, auto_increment” column type in the datastore. How do I maintain a unique numeric ID in a way that is guaranteed to work even under heavy use and concurrent requests?

Here is some code I came up with that seems to work. I’m a Python newbie, so please don’t hesitate to point out any mistakes!

What’s more, I’m still learning the Google AppEngine quirks, so I don’t yet know how to optimize the code or what performance considerations it implies. Once again, any comments are more than welcome!

from google.appengine.ext import db

class Idx(db.Model):
        name = db.StringProperty(required=True)
        count = db.IntegerProperty(required=True)

class Counter(object):
        """Unique counters for Google Datastore.
        Usage: c=Counter('hits').inc() will increase the counter 'hits' by 1 and return the new value.
        When your application is run for the first time, you should call the create(start_value) method."""
        def __init__(self, name):
                self.__name = name
                res = db.GqlQuery("SELECT * FROM Idx WHERE name = :1 LIMIT 1", self.__name).fetch(1)
                if (len(res)==0):
                        self.__status = 0 
                else:
                        self.__status = 1
                        self.__key = res[0].key()

        def create(self, start_value=0):
                """This method is NOT "thread safe". Even though it checks whether the counter already exists,
                the developer is responsible for making sure it is only called once for each counter.
                This should not be a problem, since it should only be used during application installation.
                """

                res = db.GqlQuery("SELECT * FROM Idx WHERE name = :1 LIMIT 1", self.__name).fetch(1)
                if (len(res)==0):
                        C = Idx(name=self.__name, count=start_value)
                        self.__key = C.put()
                        self.__status = 1
                else:
                        raise ValueError('Counter: ' + self.__name + ' already exists')

        def get(self):
                self.__check_sanity__()
                return db.get(self.__key).count

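        def inc(self):
                self.__check_sanity__()
                # Return the value computed inside the transaction; reading it
                # again afterwards could pick up another request's increment.
                return db.run_in_transaction(self.__inc1__)

        def __check_sanity__(self):
                if self.__status == 0:
                        raise ValueError('Counter: ' + self.__name + ' does not exist in Idx')

        def __inc1__(self):
                # Runs inside a transaction: read, increment, write, return.
                obj = db.get(self.__key)
                obj.count += 1
                obj.put()
                return obj.count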
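
What matters for unique IDs is that the increment and the read of the new value happen in one atomic step. Here is a minimal sketch of that idea in plain Python, runnable without AppEngine. It assumes a threading.Lock as a stand-in for the datastore transaction; InMemoryCounter and collect_ids are hypothetical names used only for illustration.

```python
import threading


class InMemoryCounter(object):
    """Hypothetical in-memory stand-in for a datastore-backed counter."""

    def __init__(self, start_value=0):
        self._count = start_value
        self._lock = threading.Lock()  # plays the role of the transaction

    def inc(self):
        # Increment and read inside one critical section, so no two
        # callers can be handed the same value.
        with self._lock:
            self._count += 1
            return self._count


def collect_ids(counter, workers=8, per_worker=500):
    """Call counter.inc() from several threads and return every ID issued."""
    issued = []
    issued_lock = threading.Lock()

    def work():
        local = [counter.inc() for _ in range(per_worker)]
        with issued_lock:
            issued.extend(local)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return issued
```

If inc() instead released the lock and then read the count in a separate step, two threads could see the same value, which is the kind of duplicate a unique-ID counter has to rule out.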

Suppose you have a Product class that looks like this:

class Product(db.Model):
        Serial_ID = db.IntegerProperty(required=True)
        Name = db.TextProperty(required=True)

You should have an “installation page” that is called only once during your application installation and does something like this to create the counter Product_Serial_ID with initial value 0:

Counter('Product_Serial_ID').create(0)

Calling the above code a second time will raise an exception, but concurrent calls can both pass the existence check and create two Idx entities with the same name.

Inserting a new product in the datastore:

P = Product(Serial_ID=Counter('Product_Serial_ID').inc(), Name='Product Name')
P.put()

Please note that if put() fails, the next time you try to insert the product you will get a new Product_Serial_ID. But at least you can be sure it’s unique and incremental :-)
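
To make the skipped IDs concrete, here is a small plain-Python sketch. It is only an illustration: save_product and insert_with_retry are hypothetical helpers, itertools.count stands in for Counter('Product_Serial_ID').inc(), a dict stands in for the datastore, and the put() failure is simulated.

```python
import itertools


def save_product(ids, store, name, fail=False):
    """Take the next ID, then try to store; a failed store still uses up the ID."""
    serial_id = next(ids)  # stand-in for Counter(...).inc()
    if fail:
        raise RuntimeError('simulated put() failure')
    store[serial_id] = name  # stand-in for P.put()
    return serial_id


def insert_with_retry(ids, store, name, failures=0):
    """Retry until the store succeeds; the first 'failures' attempts fail."""
    attempt = 0
    while True:
        try:
            return save_product(ids, store, name, fail=attempt < failures)
        except RuntimeError:
            attempt += 1


ids = itertools.count(1)
store = {}
first = insert_with_retry(ids, store, 'first product')
second = insert_with_retry(ids, store, 'second product', failures=2)
```

Each failed attempt takes an ID that is never stored, so the stored IDs stay unique and increasing but are not necessarily consecutive.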

3 Responses to unique integer IDs in Google datastore

  1. Lim Chee Aun says:

    Interesting attempt :)

    But I thought you could use something like obj.key().id(), which is a unique ID in Datastore, as explained here:
    http://code.google.com/appengine/docs/datastore/keysandentitygroups.html

    The only hassle is you have to store the data first to get its ID then re-store it again.

  2. Arik says:

    The only way this thing can fail is because of concurrency. Meaning what if two users are creating new items at the same time? Get count is called twice at the same time and they both get the same value. So they both store their entries with the same next value, because they receive the same current value.

    What do you think?

  3. Panayotis says:

    @Lim: You are right.

    @Arik: I’m looking into it.