The MongoDB Profiler is a database profiling system that helps identify
inefficient or slow queries and operations.
The available profiling levels are:
Level | Setting |
---|---|
0 | Off. No profiling. |
1 | On. Only includes slow operations. |
2 | On. Includes all operations. |
We can enable it by setting the profiling level with the following
command in the mongo shell:
db.setProfilingLevel(1)
By default, mongod records slow queries to its log, as defined by slowOpThresholdMs.
NOTE
Enabling the database profiler has a negative impact on MongoDB's performance.
It is better to enable it only for specific intervals, and minimally on production servers.
Profiling is enabled on a per-mongod basis; the setting does not propagate
across a replica set or sharded cluster.
We can view the output in the system.profile collection in the mongo shell using the show profile command, or with the following query:
db.system.profile.find( { millis : { $gt : 200 } } )
This command returns operations that took longer than 200 ms; the threshold
can be adjusted as needed.
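To make the millis filter concrete, here is a plain-JavaScript sketch of the same predicate applied to a few hypothetical profiler entries (the field names mirror system.profile documents; the data is invented for illustration):

```javascript
// Hypothetical profiler entries, shaped like system.profile documents.
const profileEntries = [
  { op: "query",  ns: "test.orders", millis: 450 },
  { op: "update", ns: "test.users",  millis: 120 },
  { op: "query",  ns: "test.orders", millis: 310 },
];

// Equivalent of: db.system.profile.find({ millis: { $gt: 200 } })
const slowOps = profileEntries.filter(e => e.millis > 200);
console.log(slowOps.length); // 2 operations slower than 200 ms
```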
For development and testing purposes, we can enable database profiling for an
entire mongod instance; the profiling level then applies to all of its databases.
Profiling settings cannot be enabled on a mongos instance. To enable profiling in a
sharded cluster, we have to enable it on each mongod instance in the cluster.
Query the 10 most recent entries:
db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty()
Collections with the most slow queries (by number of queries):
db.system.profile.group({key: {ns: true}, initial: {count: 0}, reduce: function(obj, prev){ prev.count++;}})
Collections with the slowest queries (by total millis spent):
db.system.profile.group({key: {ns: true}, initial: {millis: 0}, reduce: function(obj, prev){ prev.millis += obj.millis;}})
Most recent slow query:
db.system.profile.find().sort({$natural: -1}).limit(1)
Single slowest query (right now):
db.system.profile.find().sort({millis: -1}).limit(1)
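Note that the db.collection.group() helper used above was deprecated in MongoDB 3.4 and removed in 4.2; on modern servers the same roll-up is written as a $group aggregation stage. The per-namespace totals it computes can be sketched in plain JavaScript over hypothetical profiler entries:

```javascript
// Hypothetical system.profile-like entries (invented for illustration).
const entries = [
  { ns: "test.orders", millis: 450 },
  { ns: "test.users",  millis: 120 },
  { ns: "test.orders", millis: 310 },
];

// Equivalent of: { $group: { _id: "$ns", millis: { $sum: "$millis" } } }
const totals = entries.reduce((acc, e) => {
  acc[e.ns] = (acc[e.ns] || 0) + e.millis;
  return acc;
}, {});
console.log(totals); // { 'test.orders': 760, 'test.users': 120 }
```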
We create a capped collection with createCollection, setting the capped option to true:
>db.createCollection("cappedLogCollection",{capped:true,size:10000})
We can also cap the number of documents by adding the max:1000 property:
>db.createCollection("cappedLogCollection",{capped:true,size:10000,max:1000})
To check whether a collection is capped:
>db.cappedLogCollection.isCapped()
To convert an existing collection into a capped collection, use the following command:
>db.runCommand({"convertToCapped":"posts",size:10000})
The command above converts the existing posts collection into a capped collection.
Documents in a capped collection are stored in insertion order, and by default queries return them in insertion order; a $natural sort can be used to reverse that order:
>db.cappedLogCollection.find().sort({$natural:-1})
Inserts and updates are allowed, but an update may not grow a document beyond the collection's size, or it fails. Individual documents cannot be deleted, although drop() can be called to remove all documents; after a drop, the collection must be explicitly recreated.
On 32-bit machines a capped collection's maximum size is about 482.5 MB; on 64-bit machines only the file-system size limit applies.
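Conceptually, a capped collection created with max:N behaves like a ring buffer: once N documents exist, the oldest is evicted on each insert. A minimal plain-JavaScript sketch of that behavior (not the server's actual implementation):

```javascript
// Toy model of a capped collection with a max document count.
class CappedCollection {
  constructor(max) { this.max = max; this.docs = []; }
  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.max) this.docs.shift(); // evict oldest
  }
  // Equivalent of sort({ $natural: -1 }): reverse insertion order.
  findNaturalDesc() { return [...this.docs].reverse(); }
}

const log = new CappedCollection(3);
["a", "b", "c", "d"].forEach(msg => log.insert({ msg }));
console.log(log.docs.map(d => d.msg));              // [ 'b', 'c', 'd' ]
console.log(log.findNaturalDesc().map(d => d.msg)); // [ 'd', 'c', 'b' ]
```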
Aggregation in MongoDB is done with the aggregate() method.
The basic syntax of aggregate() is as follows:
>db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)
Say the collection contains the following data:
{
   _id: ObjectId(7df78ad8902c),
   title: 'MongoDB Overview',
   description: 'MongoDB is no sql database',
   by_user: 'w3cschool.cc',
   url: 'http://www.w3cschool.cc',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 100
},
{
   _id: ObjectId(7df78ad8902d),
   title: 'NoSQL Overview',
   description: 'No sql database is very fast',
   by_user: 'w3cschool.cc',
   url: 'http://www.w3cschool.cc',
   tags: ['mongodb', 'database', 'NoSQL'],
   likes: 10
},
{
   _id: ObjectId(7df78ad8902e),
   title: 'Neo4j Overview',
   description: 'Neo4j is no sql database',
   by_user: 'Neo4j',
   url: 'http://www.neo4j.com',
   tags: ['neo4j', 'database', 'NoSQL'],
   likes: 750
}
Now we use this collection to count the number of articles written by each author; with aggregate() the result looks like this:
> db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}])
{
   "result" : [
      { "_id" : "w3cschool.cc", "num_tutorial" : 2 },
      { "_id" : "Neo4j", "num_tutorial" : 1 }
   ],
   "ok" : 1
}
The example above is similar to the SQL statement: select by_user, count(*) from mycol group by by_user
Here we group the data by the by_user field and count the documents that share each by_user value.
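The $group/$sum logic above can be sketched in plain JavaScript over the same three sample documents (trimmed to the fields that matter), showing what the server computes:

```javascript
// The sample documents from the collection above, trimmed for brevity.
const posts = [
  { title: "MongoDB Overview", by_user: "w3cschool.cc", likes: 100 },
  { title: "NoSQL Overview",   by_user: "w3cschool.cc", likes: 10  },
  { title: "Neo4j Overview",   by_user: "Neo4j",        likes: 750 },
];

// Equivalent of: { $group: { _id: "$by_user", num_tutorial: { $sum: 1 } } }
const numTutorial = posts.reduce((acc, p) => {
  acc[p.by_user] = (acc[p.by_user] || 0) + 1;
  return acc;
}, {});
console.log(numTutorial); // { 'w3cschool.cc': 2, Neo4j: 1 }
```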
The table below lists some common aggregation expressions:
Expression | Description | Example |
---|---|---|
$sum | Computes the sum. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : "$likes"}}}]) |
$avg | Computes the average. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$avg : "$likes"}}}]) |
$min | Gets the minimum of the corresponding values across all documents in the group. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$min : "$likes"}}}]) |
$max | Gets the maximum of the corresponding values across all documents in the group. | db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$max : "$likes"}}}]) |
$push | Pushes a value into an array in the result document. | db.mycol.aggregate([{$group : {_id : "$by_user", url : {$push: "$url"}}}]) |
$addToSet | Pushes a value into an array in the result document, without duplicates. | db.mycol.aggregate([{$group : {_id : "$by_user", url : {$addToSet : "$url"}}}]) |
$first | Gets the first document's value, according to the documents' order. | db.mycol.aggregate([{$group : {_id : "$by_user", first_url : {$first : "$url"}}}]) |
$last | Gets the last document's value, according to the documents' order. | db.mycol.aggregate([{$group : {_id : "$by_user", last_url : {$last : "$url"}}}]) |
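The difference between $push and $addToSet in the table is easiest to see side by side; a plain-JavaScript sketch over a hypothetical list of urls:

```javascript
// Hypothetical url values for one group (note the duplicate).
const urls = ["http://a.com", "http://b.com", "http://a.com"];

// $push keeps every value, duplicates included.
const pushed = urls.reduce((arr, u) => { arr.push(u); return arr; }, []);

// $addToSet keeps each distinct value only once.
const added = urls.reduce(
  (arr, u) => (arr.includes(u) ? arr : [...arr, u]), []);

console.log(pushed.length); // 3
console.log(added.length);  // 2
```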
In Unix and Linux, a pipe generally passes the output of the current command as input to the next command.
MongoDB's aggregation pipeline works the same way: documents are processed by one pipeline stage, and the result is passed on to the next stage. Pipeline stages can be repeated.
Expressions process input documents and produce output. Expressions are stateless: they can only compute over the documents currently passing through the pipeline and cannot reference other documents.
Here we introduce a few commonly used pipeline operators:
1. $project example
db.article.aggregate( { $project : { title : 1 , author : 1 }} );
The result then contains only the _id, title, and author fields. The _id field is included by default; to exclude it:
db.article.aggregate( { $project : { _id : 0 , title : 1 , author : 1 }});
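What this $project does to each document can be sketched in plain JavaScript (the sample article and its _id value are invented for illustration):

```javascript
// A hypothetical article document.
const article = {
  _id: "id1",
  title: "MongoDB Overview",
  author: "w3cschool.cc",
  likes: 100,
};

// Equivalent of: { $project: { _id: 0, title: 1, author: 1 } }
const project = ({ title, author }) => ({ title, author });
console.log(project(article)); // { title: 'MongoDB Overview', author: 'w3cschool.cc' }
```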
2. $match example
db.articles.aggregate( [ { $match : { score : { $gt : 70, $lte : 90 } } }, { $group: { _id: null, count: { $sum: 1 } } } ] );
$match selects the records with a score greater than 70 and less than or equal to 90, then passes the matching records to the next stage, the $group pipeline operator.
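The two-stage pipeline above (filter, then count) can be sketched in plain JavaScript over hypothetical score documents:

```javascript
// Hypothetical documents with a score field.
const scores = [{ score: 65 }, { score: 75 }, { score: 90 }, { score: 95 }];

// $match: { score: { $gt: 70, $lte: 90 } }, then a $group count.
const count = scores.filter(d => d.score > 70 && d.score <= 90).length;
console.log(count); // 2 (the 75 and 90 documents)
```

Note that $lte keeps the boundary value 90, while $gt excludes the boundary value 70.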
3. $skip example
db.article.aggregate( { $skip : 5 });
After the $skip pipeline operator runs, the first five documents are "filtered" out.
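$skip corresponds directly to dropping a prefix of the document stream, as in this plain-JavaScript sketch:

```javascript
// Seven hypothetical documents, represented by their sequence numbers.
const docs = [1, 2, 3, 4, 5, 6, 7];

// Equivalent of: { $skip: 5 } — drop the first five documents.
const remaining = docs.slice(5);
console.log(remaining); // [ 6, 7 ]
```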
autoConnectRetry simply means the driver will automatically attempt to reconnect to the server(s) after unexpected disconnects. In production environments you usually want this set to true.
connectionsPerHost is the number of physical connections a single Mongo instance (it's a singleton, so you usually have one per application) can establish to a mongod/mongos process. At the time of writing, the Java driver will establish this number of connections eventually, even if the actual query throughput is low (in other words, you will see the "conn" statistic in mongostat rise until it hits this number per app server).
There is no need to set this higher than 100 in most cases, but this setting is one of those "test it and see" things. Do note that you will have to make sure you set this low enough that the total number of connections to your server does not exceed
db.serverStatus().connections.available
In production we currently have this at 40.
connectTimeout. As the name suggests, the number of milliseconds the driver will wait before a connection attempt is aborted. Set the timeout to something long (15-30 seconds) unless there's a realistic, expected chance this will get in the way of otherwise successful connection attempts. Normally, if a connection attempt takes longer than a couple of seconds, your network infrastructure isn't capable of high throughput.
maxWaitTime. The number of milliseconds a thread will wait for a connection to become available in the connection pool; an exception is raised if this does not happen in time. Keep the default.
socketTimeout. Standard socket timeout value. Set to 60 seconds (60000).
threadsAllowedToBlockForConnectionMultiplier. Multiplier for connectionsPerHost that denotes the number of threads that are allowed to wait for connections to become available if the pool is currently exhausted. This is the setting that causes the "com.mongodb.DBPortPool$SemaphoresOut: Out of semaphores to get db connection" exception: it is thrown once the thread queue exceeds the limit this multiplier implies. For example, if connectionsPerHost is 10 and this value is 5, up to 50 threads can block before the aforementioned exception is thrown.
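The arithmetic behind that exception is simply the product of the two settings; a worked example with the illustrative values from the text (not recommendations):

```javascript
// Illustrative pool settings, matching the example in the text above.
const connectionsPerHost = 10;
const threadsAllowedToBlockForConnectionMultiplier = 5;

// Threads beyond this number waiting on the pool trigger the
// "Out of semaphores to get db connection" exception.
const maxWaitingThreads =
  connectionsPerHost * threadsAllowedToBlockForConnectionMultiplier;
console.log(maxWaitingThreads); // 50
```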
If you expect big peaks in throughput that could cause large queues, temporarily increase this value. We have it at 1500 at the moment for exactly that reason. If your query load consistently outpaces the server, you should improve your hardware/scaling situation instead.
readPreference. (UPDATED, 2.8+) Used to determine the default read preference; replaces "slaveOk". Set up a ReadPreference through one of the class factory methods. A full description of the most common settings can be found at the end of this post.
w. (UPDATED, 2.6+) This value determines the "safety" of the write. When this value is -1 the write will not report any errors regardless of network or database errors. WriteConcern.NONE is the appropriate predefined WriteConcern for this. If w is 0 then network errors will make the write fail but mongo errors will not. This is typically referred to as "fire and forget" writes and should be used when performance is more important than consistency and durability. Use WriteConcern.NORMAL for this mode.
If you set w to 1 or higher, the write is considered safe. Safe writes perform the write and follow it up with a request to the server to make sure the write succeeded, or retrieve an error value if it did not (in other words, it sends a getLastError() command after your write). Note that until this getLastError() command completes, the connection is reserved. As a result of that and the additional command, throughput will be significantly lower than for writes with w <= 0. With a w value of exactly 1, MongoDB guarantees the write succeeded (or verifiably failed) on the instance you sent the write to.
In the case of replica sets you can use higher values for w, which tell MongoDB to send the write to at least "w" members of the replica set before returning (or, more accurately, to wait for the replication of your write to "w" members). You can also set w to the string "majority", which tells MongoDB to perform the write on a majority of replica set members (WriteConcern.MAJORITY). Typically you should set this to 1 unless you need raw performance (-1 or 0) or replicated writes (>1). Values higher than 1 have a considerable impact on write throughput.
fsync. Durability option that forces mongo to flush to disk after each write when enabled. I've never had any durability issues related to a write backlog so we have this on false (the default) in production.
j *(NEW 2.7+)*. Boolean that when set to true forces MongoDB to wait for a successful journaling group commit before returning. If you have journaling enabled you can enable this for additional durability. Refer to http://www.mongodb.org/display/DOCS/Journaling to see what journaling gets you (and thus why you might want to enable this flag).
ReadPreference. The ReadPreference class allows you to configure which mongod instances queries are routed to when you are working with replica sets. The following options are available:
ReadPreference.primary() : All reads go to the repset primary member only. Use this if you require all queries to return consistent (the most recently written) data. This is the default.
ReadPreference.primaryPreferred() : All reads go to the repset primary member if possible, but may query secondary members if the primary node is not available. As such, reads become eventually consistent, but only while the primary is unavailable.
ReadPreference.secondary() : All reads go to secondary repset members, and the primary member is used for writes only. Use this only if you can live with eventually consistent reads. Additional repset members can be used to scale up read performance, although there are limits to the number of (voting) members a repset can have.
ReadPreference.secondaryPreferred() : All reads go to secondary repset members if any of them are available. The primary member is used exclusively for writes unless all secondary members become unavailable. Other than the fallback to the primary member for reads this is the same as ReadPreference.secondary().
ReadPreference.nearest() : Reads go to the nearest repset member available to the database client. Use only if eventually consistent reads are acceptable. The nearest member is the member with the lowest latency between the client and the various repset members. Since busy members will eventually have higher latencies, this should also automatically balance read load, although in my experience secondary(Preferred) seems to do so better if member latencies are relatively consistent.
Note: All of the above have tag-enabled versions of the same method which return TaggableReadPreference instances instead. A full description of replica set tags can be found here: Replica Set Tags