-
Notifications
You must be signed in to change notification settings - Fork 11
/
Copy pathweek-5.txt
226 lines (171 loc) · 9.34 KB
/
week-5.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
SHARDED CLUSTER MANAGEMENT
Quiz:
1. Which of the following are essential questions to answer in planning a strategy for scaling out with shards? Check all that apply.
- What tolerance do I have for data loss?
- By what factors should I measure the capacity of my system? [T]
- How many shards do I need to add to keep my capacity under 80%? [T]
- What is the age of my existing infrastructure and should I consider updating to more reliable hardware?
- What growth trend does my application follow along my primary capacity metric? [T]
2. Which of the following accurately describe config servers in a sharded cluster? Check all that apply.
- You can consider config servers to be identical with regard to the data they contain. [T]
- The configure servers form a replica set.
- Requests from mongos processes are distributed across the config servers.
- Config servers are critical to the health of your sharded cluster. [T]
- Config servers provide automatic failover such that if one goes down, meta-data updates can continue.
3. True or false? When upgrading from MongoDB 2.4 to 2.6 there is a strict order in which you must upgrade the mongod, mongos, and config servers.
- True. [T]
- False.
4. Which of the following are services a mongos performs? Check all that apply.
- Coordinates the process of keeping shards balanced [T]
- Provides an abstraction layer that frees clients from a need to know how the data is sharded [T]
- Maintains all cluster meta data
- Merges results from individual shards for queries involving sort() [T]
- Maintains indexes for shard keys
5. Which of the following occur when a chunk is split?
- A small split token is placed in the data file
- The data file is turned into two files, but they are at the same location as they were before
- A split alters meta data only, there is no change to the data itself [T]
- The data file is split in two and possibly moved
- A new chunk is created and new writes go to the new chunk
6. Which of the following is the most likely cause of MongoDB marking a chunk as jumbo?
- The balancer is not running
- Too many large documents in your collection
- Too few shards
- A shard key with a cardinality that is too low [T]
- Too many documents in the collection
===============================================================================================================
H-w 5-1:
1. Start configdb server(s):
mongod --configsvr --dbpath /data/configdb --port 27019
2. Start a mongos instance:
mongos --configdb HP:27019 --port 27099
3. Start 3 mongod instances:
mongod --port 27031 --dbpath=/data/db
mongod --port 27032 --dbpath=/data/db1
mongod --port 27033 --dbpath=/data/db2
4. You should have 4 mongo processes running on your machine.
5. Now we need to set the 3 mongod instaces(3 shards) as a cluster, go to the mongos shell:
mongo --port 27099
mongos> db.adminCommand({addShard: "HP:27031", "name":"shard0"})
mongos> db.adminCommand({addShard: "HP:27032", "name":"shard1"})
mongos> db.adminCommand({addShard: "HP:27033", "name":"shard2"})
6. Before doing anything to cluster, we should stop the balancer to avoid automatic chunk migrations:
mongos> sh.stopBalancer()
mongos> sh.getBalancerState() //should say false here
7. Now we need to create a db m202 with presplit collection in it:
mongos> use m202
mongos> db.createCollection("presplit")
8. Now we need to enable sharding on m202 db and shard the Collections using the "a" as shard key:
mongos> sh.enableSharding("m202")
mongos> sh.shardCollection("m202.presplit",{"a":1})
9. Now you need to create chunks:
mongos> for ( var x=0; x<25; x++ ){ db.adminCommand({ split : "m202.presplit" , middle : { a : x } })}
Above command will create 26 chunks on shard0 as its primary.
10. sh.status() will show you the chunks under the m202 database.
11. Inoder to move the chunks from one shard to other shard, You can use the following command:
db.adminCommand( { moveChunk : "m202.presplit", find : {a:14}, to : "HP:27032"})
I've moved the chunks first according to the question and we need to merge these chunks here according to the ranges.
12. For merging:
db.adminCommand( { mergeChunks: "m202.presplit", bounds: [ { "a":0},{"a":2} ] })
Above command will merge the a:0 --> a:1 and a:1 --> a:2 chunks.
So, use the above command accordingly to get the ranges as per the question and sometimes you may need to move the chunks again.
mongos> sh.status(true)
--- Sharding Status ---
sharding version: {
"_id" : 1,
"version" : 4,
"minCompatibleVersion" : 4,
"currentVersion" : 5,
"clusterId" : ObjectId("53899ab5505ac666aae1e316")
}
shards:
{ "_id" : "shard0", "host" : "HP:27031" }
{ "_id" : "shard1", "host" : "HP:27032" }
{ "_id" : "shard2", "host" : "HP:27033" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "m202", "partitioned" : true, "primary" : "shard0" }
m202.presplit
shard key: { "a" : 1 }
chunks:
shard0 4
shard1 3
shard2 4
{ "a" : { "$minKey" : 1 } } -->> { "a" : 0 } on : shard0 Timestamp(18, 1)
{ "a" : 0 } -->> { "a" : 7 } on : shard0 Timestamp(20, 7)
{ "a" : 7 } -->> { "a" : 10 } on : shard0 Timestamp(20, 9)
{ "a" : 10 } -->> { "a" : 14 } on : shard0 Timestamp(20, 12)
{ "a" : 14 } -->> { "a" : 15 } on : shard1 Timestamp(20, 1)
{ "a" : 15 } -->> { "a" : 20 } on : shard1 Timestamp(20, 16)
{ "a" : 20 } -->> { "a" : 21 } on : shard1 Timestamp(15, 0)
{ "a" : 21 } -->> { "a" : 22 } on : shard2 Timestamp(20, 0)
{ "a" : 22 } -->> { "a" : 23 } on : shard2 Timestamp(17, 0)
{ "a" : 23 } -->> { "a" : 24 } on : shard2 Timestamp(18, 0)
{ "a" : 24 } -->> { "a" : { "$maxKey" : 1 } } on : shard2 Timestamp(19, 0)
mongos>
mongos> cluster = new ShardingTest({"shards" : 3, "chunksize" : 1, "config": 3, other:{rs:true} }) can be used to create a shard cluster with 3replica nodes as 3 shards with 3 configdbs.
===============================================================================================================
H-w 5.2:
Which of the following are advantages of pre-splitting the data that is being loaded into a sharded cluster, rather than throwing all of the data in and waiting for it to migrate?
- You can decide which shard has which data range initially if you pre-split the data
- Migration takes time, especially when the system is under load
===============================================================================================================
H-w 5.3:
$ mongo --nodb
Now, we will spin up a sharded cluster with 2 shards, each of which has 3 members (a primary, a secondary, and an arbiter) plus one config server. We will do this using ShardingTest as follows.
> cluster = new ShardingTest({shards: 2, chunksize: 1, rs : {nodes : [{}, {}, {arbiter: true}]} });
Give that a minute to spin up and you should see all of your mongod's plus a mongos at port 30999. At this point, you can initialize MongoProc and begin your homework.
Start by initializing with MongoProc. Once you have initialized, your "testDB.testColl" collection will be populated (it will be dropped at the beginning of the initialization if present), and it will be split into chunks, and those chunks will be distributed across each of your shards.
- mongo --port 30999
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"version" : 4,
"minCompatibleVersion" : 4,
"currentVersion" : 5,
"clusterId" : ObjectId("538b7b7d9f91188f8156c2b0")
}
shards:
{ "_id" : "test-rs0", "host" : "test-rs0/HP:31100,HP:31101" }
{ "_id" : "test-rs1", "host" : "test-rs1/HP:31200,HP:31201" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "testDB", "partitioned" : true, "primary" : "test-rs0" }
testDB.testColl
shard key: { "_id" : 1 }
chunks:
test-rs1 12
test-rs0 9
too many chunks to print, use verbose if you want to force print
mongos> show dbs
admin (empty)
config 0.016GB
testDB 0.063GB
mongos> use testDB
switched to db testDB
mongos> show collections
system.indexes
testColl
mongos> db.testColl.find({"_id": 90001})
{ "_id" : 90001, "otherID" : ObjectId("5389f8fa9a85db1c5827f5dc"), "string" : "testStringForPadding0000000000000000000000000000000000000000" }
mongos>
mongos>
mongos> use admin
switched to db admin
mongos> db.adminCommand({flushRouterConfig:1})
{ "flushed" : true, "ok" : 1 }
mongos> db.testColl.find({"_id": 90001})
Done.
===============================================================================================================
H-w 5.4:
In a sharded cluster, which of the following can keep large chunks from being split as part of MongoDB's balancing process? Check all that apply.
- Frequent restarts of mongos processes
- If there are not enough distinct shard key values
- If one of the config servers is down when a mongos tries to do a split
===============================================================================================================
H-w 5.5:
Which of the following are REQUIRED when upgrading a sharded cluster to MongoDB 2.6? Check all that apply.
- If your MongoDB deployment is not already running MongoDB 2.4, upgrade to 2.4 first.
- Upgrade all mongos instances before upgrading mongod instances.
- Disable the balancer