Tekitou na Site (Beta)

Memo: Trying out MongoDB

MongoDB

Installation

$ sudo aptitude install mongodb

Checking the indexes

> db.numbers.getIndexes();
[
        {
                "v" : 1,
                "key" : {
                        "_id" : 1
                },
                "ns" : "test.numbers",
                "name" : "_id_"
        },
        {
                "v" : 1,
                "key" : {
                        "num" : 1
                },
                "ns" : "test.numbers",
                "name" : "num_1"
        }
]

The stats method

> db.stats()
{
        "db" : "test",
        "collections" : 3,
        "objects" : 2000006,
        "avgObjSize" : 36.000051999844,
        "dataSize" : 72000320,
        "storageSize" : 121733120,
        "numExtents" : 15,
        "indexes" : 2,
        "indexSize" : 115232544,
        "fileSize" : 1006632960,
        "nsSizeMB" : 16,
        "dataFileVersion" : {
                "major" : 4,
                "minor" : 5
        },
        "ok" : 1
}
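
As a quick sanity check on the numbers above, avgObjSize looks like dataSize divided by objects. The sketch below only redoes that arithmetic in Ruby with values copied from the output. Reading fileSize as a size in bytes that converts cleanly to MiB is my own assumption.

```ruby
# Values copied from the db.stats() output above.
stats = {
  'objects'  => 2_000_006,
  'dataSize' => 72_000_320,
  'fileSize' => 1_006_632_960
}

# Average object size in bytes: total data bytes / number of objects.
avg_obj_size = stats['dataSize'].to_f / stats['objects']

# File size converted from bytes to MiB (assumption: fileSize is in bytes).
file_size_mib = stats['fileSize'] / (1024 * 1024)

puts avg_obj_size
puts file_size_mib
```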

The same stats can also be obtained like this:

> db.runCommand({dbstats:1});
{
        "db" : "test",
        "collections" : 3,
        "objects" : 2000006,
        "avgObjSize" : 36.000051999844,
        "dataSize" : 72000320,
        "storageSize" : 121733120,
        "numExtents" : 15,
        "indexes" : 2,
        "indexSize" : 115232544,
        "fileSize" : 1006632960,
        "nsSizeMB" : 16,
        "dataFileVersion" : {
                "major" : 4,
                "minor" : 5
        },
        "ok" : 1
}

> db.runCommand
function ( obj ){
    if ( typeof( obj ) == "string" ){
        var n = {};
        n[obj] = 1;
        obj = n;
    }
    return this.getCollection( "$cmd" ).findOne( obj );
}

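In short, runCommand is just a findOne on the special $cmd collection, so a command such as collstats can also be run like this: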

> db.$cmd.findOne({collstats:'numbers'});
{
        "ns" : "test.numbers",
        "count" : 2000000,
        "size" : 72000040,
        "avgObjSize" : 36.00002,
        "storageSize" : 121724928,
        "numExtents" : 13,
        "nindexes" : 2,
        "lastExtentSize" : 34312192,
        "paddingFactor" : 1,
        "systemFlags" : 1,
        "userFlags" : 0,
        "totalIndexSize" : 115232544,
        "indexSizes" : {
                "_id_" : 64909264,
                "num_1" : 50323280
        },
        "ok" : 1
}
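
The runCommand source above does two things: a bare string becomes { name: 1 }, and the resulting document is passed to findOne on $cmd. The first step can be sketched in Ruby as follows. normalize_command is a hypothetical helper name for illustration, not part of any driver.

```ruby
# Hypothetical helper mirroring the shell's runCommand argument handling:
# a bare command name becomes { name => 1 }, a hash passes through as-is.
def normalize_command(cmd)
  cmd.is_a?(String) ? { cmd => 1 } : cmd
end

p normalize_command('dbstats')
p normalize_command({ 'collstats' => 'numbers' })
```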

Let's look at the save method.

> db.numbers.save
function ( obj ){
    if ( obj == null || typeof( obj ) == "undefined" )
        throw "can't save a null";

    if ( typeof( obj ) == "number" || typeof( obj) == "string" )
        throw "can't save a number or string"

    if ( typeof( obj._id ) == "undefined" ){
        obj._id = new ObjectId();
        return this.insert( obj );
    }
    else {
        return this.update( { _id : obj._id } , obj , true );
    }
}

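In other words, save is just a wrapper around insert and update: without an _id it inserts, and with one it calls update with the upsert flag (the third argument) set to true.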
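
To make the branching concrete, here is a minimal Ruby sketch of the same logic over an in-memory hash. ToyCollection and all of its methods are made up for illustration and are not the MongoDB driver. SecureRandom.hex(12) is only a stand-in for a real ObjectId.

```ruby
require 'securerandom'

# Toy in-memory collection (illustration only, not the MongoDB driver).
# Documents are stored in a hash keyed by _id.
class ToyCollection
  def initialize
    @docs = {}
  end

  def insert(doc)
    raise ArgumentError, 'duplicate _id' if @docs.key?(doc['_id'])
    @docs[doc['_id']] = doc.dup
    doc['_id']
  end

  # Replace the document with this _id, creating it if missing (upsert).
  def upsert(doc)
    @docs[doc['_id']] = doc.dup
    doc['_id']
  end

  # Same branching as the shell's save: no _id means insert with a fresh
  # id, otherwise upsert by _id.
  def save(doc)
    raise ArgumentError, "can't save nil" if doc.nil?
    raise ArgumentError, "can't save a number or string" if doc.is_a?(Numeric) || doc.is_a?(String)

    if doc['_id'].nil?
      doc['_id'] = SecureRandom.hex(12) # stand-in for ObjectId, 24 hex chars
      insert(doc)
    else
      upsert(doc)
    end
  end

  def find_one(id)
    @docs[id]
  end

  def count
    @docs.size
  end
end

users = ToyCollection.new
id = users.save({ 'last_name' => 'smith' })                       # no _id: insert
users.save({ '_id' => id, 'last_name' => 'smith', 'age' => 30 }) # has _id: upsert
p users.find_one(id)
p users.count
```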

Ruby

$ sudo aptitude install ruby
$ sudo gem install mongo

$ ruby connect.rb
      ** Notice: The native BSON extension was not loaded. **

      For optimal performance, use of the BSON extension is recommended.

      To enable the extension make sure ENV['BSON_EXT_DISABLED'] is not set
      and run the following command:

        gem install bson_ext

      If you continue to receive this message after installing, make sure that
      the bson_ext gem is in your load path.

It complained that bson_ext is missing.

$ sudo gem install bson_ext
Fetching: bson_ext-1.10.2.gem (100%)
Building native extensions.  This could take a while...
ERROR:  Error installing bson_ext:
        ERROR: Failed to build gem native extension.

        /usr/bin/ruby1.9.1 extconf.rb
/usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require': cannot load such file -- mkmf (LoadError)
        from /usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require'
        from extconf.rb:1:in `<main>'


Gem files will remain installed in /var/lib/gems/1.9.1/gems/bson_ext-1.10.2 for inspection.
Results logged to /var/lib/gems/1.9.1/gems/bson_ext-1.10.2/ext/cbson/gem_make.out
tekitoh@tekitoh-ubuntu:~/ruby$ ruby extconf.rb
ruby: No such file or directory -- extconf.rb (LoadError)

It complained that mkmf is missing. mkmf comes with the Ruby development package, so install that:

$ sudo aptitude install ruby1.9.1-dev

Try again.

$ sudo gem install bson_ext
Building native extensions.  This could take a while...
Successfully installed bson_ext-1.10.2
1 gem installed
Installing ri documentation for bson_ext-1.10.2...
Installing RDoc documentation for bson_ext-1.10.2...

It installed.

$ ruby connect.rb

Running it printed nothing, so it worked.

Trying it from irb

irb(main):008:0> load 'connect.rb'
=> true
irb(main):009:0> id = @users.save({"lastname" => "knuth"})
=> BSON::ObjectId('53ff08ce1f303b0865000001')
irb(main):010:0> @users.find_one({"_id" => id})
=> {"_id"=>BSON::ObjectId('53ff08ce1f303b0865000001'), "lastname"=>"knuth"}

irb(main):011:0> smith = {"last_name" => "smith", "age" => 30}
=> {"last_name"=>"smith", "age"=>30}
irb(main):012:0> jones = {"last_name" => "jones", "age" => 40}
=> {"last_name"=>"jones", "age"=>40}

irb(main):015:0> p @users.find_one({"_id" => smith_id})
{"_id"=>BSON::ObjectId('53ff092a1f303b0865000002'), "last_name"=>"smith", "age"=>30}
=> {"_id"=>BSON::ObjectId('53ff092a1f303b0865000002'), "last_name"=>"smith", "age"=>30}


irb(main):021:0> @users.find({"age" => {"$gt" => 20}})
=> <Mongo::Cursor:0xd71568 namespace='tutorial.users' @selector={"age"=>{"$gt"=>20}} @cursor_id=>
irb(main):022:0> @users.find({"age" => {"$gt" => 35}})
=> <Mongo::Cursor:0xd723a0 namespace='tutorial.users' @selector={"age"=>{"$gt"=>35}} @cursor_id=>

find returns a Mongo::Cursor, which means the results can also be fetched one at a time.

irb(main):023:0> cursor = @users.find({"age" => {"$gt" => 20}})
=> <Mongo::Cursor:0xd76a7c namespace='tutorial.users' @selector={"age"=>{"$gt"=>20}} @cursor_id=>
irb(main):024:0> cursor.each do |doc|
irb(main):025:1*   puts doc['last_name']
irb(main):026:1> end
smith
jones
=> nil

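What comes back is a Mongo::Cursor. You might think find could simply return an array of results, but fetching, say, one million documents that way would consume a huge amount of resources. Fetching them through a cursor uses far less.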
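
The difference can be sketched in plain Ruby without a server. Below, an Enumerator stands in for the cursor (an assumption for illustration; Mongo::Cursor itself is not used) and produces documents only when they are asked for. The fetched counter shows how many were actually produced.

```ruby
# Stand-in for a large result set (assumption: 1,000,000 matching documents).
# An Enumerator produces documents on demand, the way a cursor fetches them.
fetched = 0
cursor = Enumerator.new do |yielder|
  1_000_000.times do |i|
    fetched += 1
    yielder << { 'last_name' => "user#{i}", 'age' => 21 + (i % 50) }
  end
end

# Taking the first three documents only produces three of them.
first_three = cursor.lazy.map { |doc| doc['last_name'] }.first(3)
p first_three
p fetched

# cursor.to_a, by contrast, would build all 1,000,000 hashes in memory.
```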

Database commands can be sent as well.

irb(main):018:0> @admin_db = @con['admin']
=> #<Mongo::DB:0x000000019caf08 @name="admin", @client=#<Mongo::Connection:0x000000010cea58 @host="localhost", @port=27017, @id_lock=#<Mutex:0x000000010ce8a0>, @primary=["localhost", 27017], @primary_pool=#<Mongo::Pool:0x935238 @host=localhost @port=27017 @ping_time= 0/1 sockets available up=true>, @mongos=false, @tag_sets=[], @acceptable_latency=15, @max_bson_size=16777216, @max_message_size=48000000, @max_wire_version=nil, @min_wire_version=nil, @max_write_batch_size=nil, @slave_ok=nil, @ssl=nil, @unix=false, @socket_opts={}, @socket_class=Mongo::TCPSocket, @db_name=nil, @auths=#<Set: {}>, @pool_size=1, @pool_timeout=5.0, @op_timeout=nil, @connect_timeout=30, @logger=nil, @read=:primary, @write_concern={:w=>1}, @read_primary=true>, @strict=nil, @pk_factory=nil, @write_concern={:w=>1}, @read=:primary, @tag_sets=[], @acceptable_latency=15, @cache_time=300>

Sending the listDatabases command returns a hash.

irb(main):019:0> @admin_db.command({"listDatabases" => 1})
=> {"databases"=>[{"name"=>"local", "sizeOnDisk"=>83886080.0, "empty"=>false}, {"name"=>"test", "sizeOnDisk"=>1023410176.0, "empty"=>false}, {"name"=>"tutorial", "sizeOnDisk"=>218103808.0, "empty"=>false}], "totalSize"=>1325400064.0, "ok"=>1.0}
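
Since the result is an ordinary Ruby hash, it can be picked apart with the usual Hash and Array methods. A small sketch using the result copied from the session above (no server needed):

```ruby
# Result hash copied from the irb session above.
result = {
  'databases' => [
    { 'name' => 'local',    'sizeOnDisk' => 83_886_080.0,    'empty' => false },
    { 'name' => 'test',     'sizeOnDisk' => 1_023_410_176.0, 'empty' => false },
    { 'name' => 'tutorial', 'sizeOnDisk' => 218_103_808.0,   'empty' => false }
  ],
  'totalSize' => 1_325_400_064.0,
  'ok' => 1.0
}

# Database names, and the sum of the per-database sizes.
names = result['databases'].map { |db| db['name'] }
sum   = result['databases'].map { |db| db['sizeOnDisk'] }.reduce(0.0, :+)

p names
p sum == result['totalSize']
```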

Mongo object IDs

53ff092a1f303b0865000002
-------+-----+---+------
   a     b     c    d
  • a is a 4-byte timestamp
  • b is the machine ID (3 bytes)
  • c is the process ID (2 bytes)
  • d is a counter (3 bytes)
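
The layout can be checked by slicing the hex string in Ruby. This is a sketch assuming field widths of 4, 3, 2, and 3 bytes (8, 6, 4, and 6 hex characters), which matches the diagram above. split_object_id is a hypothetical helper, not a driver method.

```ruby
# Split a 24-character ObjectId hex string into its four fields, assuming
# a 4-byte timestamp, 3-byte machine ID, 2-byte process ID, 3-byte counter.
def split_object_id(hex)
  raise ArgumentError, 'expected 24 hex characters' unless hex =~ /\A\h{24}\z/

  {
    timestamp: Time.at(hex[0, 8].to_i(16)).utc, # seconds since the epoch
    machine:   hex[8, 6],
    pid:       hex[14, 4].to_i(16),
    counter:   hex[18, 6].to_i(16)
  }
end

parts = split_object_id('53ff092a1f303b0865000002')
p parts
```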

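Because the ID starts with a timestamp, a query on _id can select a range of time. (The variable names oct_id and nov_id below are just labels; the range used is actually 2014-08-01 up to 2014-10-01.)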

irb(main):011:0> oct_id = BSON::ObjectId.from_time(Time.utc(2014, 8, 1))
=> BSON::ObjectId('53dad8800000000000000000')
irb(main):012:0> nov_id = BSON::ObjectId.from_time(Time.utc(2014, 10, 1))
=> BSON::ObjectId('542b44000000000000000000')
irb(main):013:0> @users.find({'_id' => {'$gte' => oct_id, '$lt' => nov_id}})
=> <Mongo::Cursor:0xbf2750 namespace='tutorial.users' @selector={"_id"=>{"$gte"=>BSON::ObjectId('53dad8800000000000000000'), "$lt"=>BSON::ObjectId('542b44000000000000000000')}} @cursor_id=>
irb(main):014:0> @users.find({'_id' => {'$gte' => oct_id, '$lt' => nov_id}}).each do |doc|
irb(main):015:1*   p doc
irb(main):016:1> end
{"_id"=>BSON::ObjectId('53ff08ce1f303b0865000001'), "lastname"=>"knuth"}
{"_id"=>BSON::ObjectId('53ff092a1f303b0865000002'), "age"=>30, "city"=>"Chicago", "last_name"=>"smith"}
{"_id"=>BSON::ObjectId('53ff09331f303b0865000003'), "last_name"=>"jones", "age"=>40}
=> nil
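
What from_time produces can be reproduced by hand, judging from the values shown above: the time as 8 hex characters followed by 16 zeros. The sketch below rebuilds the two boundary IDs in plain Ruby and applies the same range check to the three IDs from the session. boundary_id is a hypothetical helper, and comparing IDs as strings is my simplification of what the server does.

```ruby
# Hypothetical pure-Ruby equivalent of building a boundary id from a time:
# 4-byte big-endian seconds since the epoch, followed by 8 zero bytes.
def boundary_id(time)
  format('%08x', time.to_i) + '0' * 16
end

from = boundary_id(Time.utc(2014, 8, 1))
to   = boundary_id(Time.utc(2014, 10, 1))

# The three _id values printed in the session above.
ids = %w[
  53ff08ce1f303b0865000001
  53ff092a1f303b0865000002
  53ff09331f303b0865000003
]

# Fixed-width lowercase hex strings sort the same way as the underlying
# bytes, so string comparison reproduces the $gte / $lt range check.
in_range = ids.select { |id| id >= from && id < to }
p from
p to
p in_range
```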

Building the sample application

$ sudo gem install twitter
$ sudo gem install sinatra

A lot had changed, so I fixed things up by checking the official sites and so on.

Indexes and explain

> db.numbers.find({num:{$gt:114, $lt: 220}}).explain()
{
        "cursor" : "BasicCursor",
        "isMultiKey" : false,
        "n" : 105,
        "nscannedObjects" : 200000,
        "nscanned" : 200000,
        "nscannedObjectsAllPlans" : 200000,
        "nscannedAllPlans" : 200000,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 89,
        "indexBounds" : {

        },
        "server" : "tekitoh-ubuntu:27017"
}

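The cursor field says BasicCursor. That means the query scans from the first document to the last (nscanned is 200000 to return n = 105 documents), which is terribly slow. Let's create an index.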

> db.numbers.ensureIndex({num:1});
> db.numbers.getIndexes()
[
        {
                "v" : 1,
                "key" : {
                        "_id" : 1
                },
                "ns" : "tutorial.numbers",
                "name" : "_id_"
        },
        {
                "v" : 1,
                "key" : {
                        "num" : 1
                },
                "ns" : "tutorial.numbers",
                "name" : "num_1"
        }
]

The index was created.

> db.numbers.find({num:{$gt:114, $lt: 220}}).explain()
{
        "cursor" : "BtreeCursor num_1",
        "isMultiKey" : false,
        "n" : 105,
        "nscannedObjects" : 105,
        "nscanned" : 105,
        "nscannedObjectsAllPlans" : 105,
        "nscannedAllPlans" : 105,
        "scanAndOrder" : false,
        "indexOnly" : false,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "millis" : 0,
        "indexBounds" : {
                "num" : [
                        [
                                114,
                                220
                        ]
                ]
        },
        "server" : "tekitoh-ubuntu:27017"
}

The cursor field now says BtreeCursor num_1, nscanned has dropped to 105, and millis is 0 ms, so the query is using the index.
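
Why the scanned count drops from 200000 to 105 can be imitated in plain Ruby. This is a toy model, not how MongoDB is implemented: a sorted array plus binary search stands in for the B-tree, and the data shape (num running from 0 to 199999) is an assumption.

```ruby
# Toy model: 200,000 documents with num = 0..199_999 (assumed data shape).
docs = (0...200_000).map { |i| { 'num' => i } }

# Full scan, like BasicCursor: every document is examined.
scanned_full = 0
hits_full = docs.select do |doc|
  scanned_full += 1
  doc['num'] > 114 && doc['num'] < 220
end

# Sorted index on num, like BtreeCursor: binary-search to the first
# matching key, then walk forward only while keys stay in range.
index = docs.each_with_index.map { |doc, pos| [doc['num'], pos] }.sort
start = index.bsearch_index { |key, _pos| key > 114 } || index.size
scanned_index = 0
hits_index = []
index[start..-1].each do |key, pos|
  break unless key < 220
  scanned_index += 1
  hits_index << docs[pos]
end

puts "full scan:  n=#{hits_full.size} scanned=#{scanned_full}"
puts "index scan: n=#{hits_index.size} scanned=#{scanned_index}"
```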

References

MongoDB in Action