M220JS study notes

15 Wednesday May 2019

MongoDB offers a free MOOC, M220JS, for JavaScript developers. This page collects my study notes from the course. The code snippets are taken from the course repository. M220JS introduces MongoDB as the backend of a Node.js application. While wiring up the application's database layer, the course covers the following topics:

Creation and sharing of database connections
Writing data with different levels of durability
Handling errors from the driver

The application uses Node.js, Express, MongoDB and React.

In Atlas, create a new project named M220 and a free-tier cluster named mflix.

m220js-atlas-cluster

The mflix-js application provides a skeleton; most of the required changes are in the data access objects (src/dao):

src/dao/usersDAO.js
src/dao/moviesDAO.js
src/dao/commentsDAO.js

m220js-project-folder-structure

The environment used:

m220js-system-requirements

The npm package manager is used throughout the development cycle:

npm install
npm start
npm test -t mongoclient

The application's API layer listens on port 5000 by default.

m220js-testing-app-start

m220js-running-unit-tests

Uploading data to the cluster with mongorestore:

sifa@sifa:~/MongoDb/m220js/mflix-js$ mongorestore --drop --gzip --uri "mongodb+srv://:@m.mongodb.net/test?retryWrites=true" data

m220js-findone-after-restoring-data

Callbacks, promises, asynchronous waits

This section covers how the driver is used in asynchronous programming. It is a helpful introduction for JavaScript beginners like me. The relevant test is test/lessons/callbacks-promises-async.spec.js.

findOne(query, options, callback) -> {Promise}

When no callback is provided, a promise is returned. To use await, the enclosing function must be async, and await expressions should be wrapped in a try/catch block so that rejected promises are handled.

callback:

movies.findOne({ title: "Once Upon a Time in Mexico" }, function(err, doc) {
  expect(err).toBeNull()
  expect(doc.title).toBe("Once Upon a Time in Mexico")
  expect(doc.cast).toContain("Salma Hayek")
  done()
})

promise:

movies
  .findOne({ title: "Once Upon a Time in Mexico" })
  .then(doc => {
    expect(doc.title).toBe("Once Upon a Time in Mexico")
    expect(doc.cast).toContain("Salma Hayek")
    done()
  })
  .catch(err => {
    expect(err).toBeNull()
    done()
  })

asynchronous wait:

try {
  let { title } = await movies.findOne({
    title: "Once Upon a Time in Mexico",
  })
  let { cast } = await movies.findOne({
    title: "Once Upon a Time in Mexico",
  })
  expect(title).toBe("Once Upon a Time in Mexico")
  expect(cast).toContain("Salma Hayek")
} catch (e) {
  expect(e).toBeNull()
}
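
The dual interface described above (a callback if one is passed, a promise otherwise) can be sketched in plain JavaScript. This is only an illustration under stated assumptions: findOneLike and the in-memory store array are hypothetical stand-ins, not the driver's implementation.

```javascript
// Hypothetical stand-in for a driver method such as findOne(); it searches an
// in-memory array instead of a MongoDB collection.
function findOneLike(store, query, callback) {
  const work = new Promise(resolve => {
    // A document matches when every key in the query has an equal value.
    const doc = store.find(d =>
      Object.keys(query).every(key => d[key] === query[key]),
    )
    resolve(doc === undefined ? null : doc)
  })

  if (typeof callback === "function") {
    // Callback style: report (err, doc) and return nothing.
    work.then(doc => callback(null, doc), err => callback(err))
    return undefined
  }
  // No callback: hand the promise back so the caller can use then() or await.
  return work
}

const store = [{ title: "Once Upon a Time in Mexico", year: 2003 }]

// Promise style
const pending = findOneLike(store, { title: "Once Upon a Time in Mexico" })

// Callback style
findOneLike(store, { year: 2003 }, (err, doc) => {
  console.log(err, doc.title)
})
```

Inside an async function the promise form reads as `const doc = await findOneLike(store, query)`, which is the shape used in the async/await test above.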

Basic Reads

findOne() returns a single document: the first one that matches the filter, or null if nothing matches. The filter can be on any field, not only on the uniquely indexed _id field.

let result = await movies.findOne({ cast: filter })
expect(result).not.toBeNull()

Projection works a little differently than in the mongo shell: in the driver the projection must be passed inside the options object, as {projection: {...}}. The _id field is included by default, so it must be explicitly excluded from the projection if it is not wanted.

let result2 = await movies.findOne(
  { cast: filter },
  { projection: { title: 1, year: 1, _id: 0 } },
)
expect(result2).not.toBeNull()
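
To make the _id behaviour concrete, here is a simplified model of an inclusion projection in plain JavaScript. It is a sketch, not the server's algorithm: applyInclusionProjection is a hypothetical helper, it handles top-level fields only, and the sample document is made up.

```javascript
// Simplified model of an inclusion projection on top-level fields only.
// The real server also handles nested paths and exclusion projections.
function applyInclusionProjection(doc, projection) {
  const result = {}
  // _id is included unless the projection explicitly sets it to 0.
  if (projection._id !== 0 && "_id" in doc) {
    result._id = doc._id
  }
  for (const [field, flag] of Object.entries(projection)) {
    if (field !== "_id" && flag === 1 && field in doc) {
      result[field] = doc[field]
    }
  }
  return result
}

// Made-up sample document
const movie = { _id: "abc123", title: "Frida", year: 2002, cast: ["Salma Hayek"] }

const withId = applyInclusionProjection(movie, { title: 1, year: 1 })
const withoutId = applyInclusionProjection(movie, { title: 1, year: 1, _id: 0 })
```

Here withId still carries _id, while withoutId contains only title and year.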

find() returns a cursor, which can be iterated with next().

let result = await movies.find({
cast: { $all: ["Salma Hayek", "Johnny Depp"] },
})
expect(result).not.toBeNull()
let { title, year, cast } = await result.next()
expect(title).toBe("Once Upon a Time in Mexico")

To query by countries and return only the title, a simple solution is:

cursor = await movies.find({countries: {$in: countries}}, { projection: {title: 1 }})

m220js-running-suite-of-integration-tests

Chapter #2 User-Facing Backend

Cursor methods and aggregation equivalents

The number of documents a cursor iterates over can be limited with limit():

const limitedCursor = movies
  .find({ directors: "Sam Raimi" }, { projection: { _id: 0, title: 1, cast: 1 } })
  .limit(2)

expect((await limitedCursor.toArray()).length).toEqual(2)

Its aggregation equivalent is:

const limitPipeline = [
  { $match: { directors: "Sam Raimi" } },
  { $project: { _id: 0, title: 1, cast: 1 } },
  { $limit: 2 },
]

const limitedAggregation = await movies.aggregate(limitPipeline)

expect((await limitedAggregation.toArray()).length).toEqual(2)

Sorting documents (in ascending order)

const sortedCursor = movies
  .find({ directors: "Sam Raimi" }, { projection: { _id: 0, year: 1, title: 1, cast: 1 } })
  .sort([["year", 1]])

const movieArray = await sortedCursor.toArray()

Its aggregation equivalent:

const sortPipeline = [
  { $match: { directors: "Sam Raimi" } },
  { $project: { _id: 0, year: 1, title: 1, cast: 1 } },
  { $sort: { year: 1 } },
]

const sortAggregation = await movies.aggregate(sortPipeline)
const aggregatedMovieArray = await sortAggregation.toArray()

Skipping documents

Skipping documents only makes sense when the query is sorted; otherwise the order, and therefore the set of skipped documents, is not defined. For example, to skip the five oldest movies:

const skippedCursor = movies
  .find({ directors: "Sam Raimi" }, { projection: { _id: 0, year: 1, title: 1, cast: 1 } })
  .sort([["year", 1]])
  .skip(5)

Its aggregation equivalent:

const skippedPipeline = [
  { $match: { directors: "Sam Raimi" } },
  { $project: { _id: 0, year: 1, title: 1, cast: 1 } },
  { $sort: { year: 1 } },
  { $skip: 5 },
]

Sorting, skipping and limiting can be combined to form more complex queries such as paging. For example, to page through the data:

let { query = {}, project = {}, sort = DEFAULT_SORT } = queryParams

let cursor = await movies.find(query).project(project).sort(sort)

let itemsThatShouldBeSkipped = page*moviesPerPage
const displayCursor = cursor.skip(itemsThatShouldBeSkipped).limit(moviesPerPage)
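
The skip/limit arithmetic behind paging can be checked on a plain array. This is a sketch under assumptions: pageWindow and pageOf are hypothetical helpers, the page number is zero-based as in the snippet above, and the array stands in for an already sorted cursor.

```javascript
// page is zero-based, matching page * moviesPerPage in the snippet above.
function pageWindow(page, moviesPerPage) {
  return { skip: page * moviesPerPage, limit: moviesPerPage }
}

// In-memory equivalent of cursor.skip(skip).limit(limit),
// shown on a plain array instead of a driver cursor.
function pageOf(items, page, moviesPerPage) {
  const { skip, limit } = pageWindow(page, moviesPerPage)
  return items.slice(skip, skip + limit)
}

const sortedTitles = ["A", "B", "C", "D", "E"]
const secondPage = pageOf(sortedTitles, 1, 2) // items at positions 2 and 3
const lastPage = pageOf(sortedTitles, 2, 2) // a short final page
```

With a real cursor the same window would be applied as `cursor.skip(skip).limit(limit)`.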

m220js-chapter2-paging

Aggregation

An aggregation is a pipeline composed of one or more stages. Each stage uses one or more expressions, and an expression is a function that performs one basic transformation on the data, for example {"$add": ["$a", "$b"]}.

An example is a $match, $project, $group pipeline:

$match: {directors: "Sam Raimi"}

$project: {_id: 0, title: 1, "imdb.rating": 1}

$group: {_id: 0, avg_rating: {"$avg": "$imdb.rating"}}
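
The same three stages can be written as the array of stage objects that the Node.js driver accepts, next to a plain-JavaScript rendering of what each stage does. This is a sketch: the field names assume the mflix movies schema (directors, imdb.rating), field paths inside expressions carry a leading $, and averageRating and the sample documents are made up for illustration.

```javascript
// Pipeline as an array of stage objects.
const pipeline = [
  { $match: { directors: "Sam Raimi" } },
  { $project: { _id: 0, title: 1, "imdb.rating": 1 } },
  { $group: { _id: 0, avg_rating: { $avg: "$imdb.rating" } } },
]
// With a real collection: await movies.aggregate(pipeline).toArray()

// Plain-JavaScript rendering of the same three steps on sample data.
function averageRating(docs, director) {
  const ratings = docs
    .filter(d => d.directors.includes(director)) // $match
    .map(d => d.imdb.rating) // $project
  const sum = ratings.reduce((acc, r) => acc + r, 0) // $group with $avg
  return ratings.length === 0 ? null : sum / ratings.length
}

// Made-up sample documents
const sample = [
  { title: "X", directors: ["Sam Raimi"], imdb: { rating: 7 } },
  { title: "Y", directors: ["Sam Raimi"], imdb: { rating: 8 } },
  { title: "Z", directors: ["Someone Else"], imdb: { rating: 3 } },
]
const avg = averageRating(sample, "Sam Raimi")
```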

Basic write operations

Documents can be inserted with insertOne() and insertMany(). Trying to insert a duplicate _id will fail. As an alternative to inserting, we can specify an upsert in an update operation.

inserting one document:

let insertResult = await videoGames.insertOne({
  title: "Fortnite",
  year: 2018,
})
let { n, ok } = insertResult.result
expect({ n, ok }).toEqual({ n: 1, ok: 1 })
expect(insertResult.insertedCount).toBe(1)
expect(insertResult.insertedId).not.toBeUndefined()

updateOne() may also be used with {upsert: true} in its options. If a matching document is already present it is updated; otherwise a brand new document is inserted.

let upsertResult = await videoGames.updateOne(
  // this is the "query" portion of the update
  { title: "Call of Duty" },
  // this is the update
  {
    $set: {
      title: "Call of Duty",
      year: 2003,
    },
  },
  // this is the options document. We've specified upsert: true, so if the
  // query doesn't find a document to update, it will be written instead as
  // a new document
  { upsert: true },
)

m220js-chapter2-usermanagement

Write Concerns

A write concern states how far a write must have propagated through the cluster before it is acknowledged. The default write concern, {w: 1}, only requests an acknowledgement that one node (the primary) applied the write.

If we want the primary to wait until a majority of nodes have replicated the data before acknowledging, we can use {w: "majority"}. This takes longer than {w: 1} because of replication lag, but it is more durable, so it suits vital writes that must be majority-committed.

There is also a fire-and-forget option, {w: 0}, which does not request an acknowledgement. It is the fastest but also the least durable. An example is an IoT device that frequently sends non-vital data, where losing some of it is not a problem.

In a 3-node MongoDB replica set, valid write concerns include:

{w: 0}
{w: 1}
{w: "majority"}
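
The arithmetic behind "majority" and behind unsatisfiable write concerns can be sketched as follows. Both helpers are hypothetical and simplified: they assume every member of the replica set is a data-bearing, voting node.

```javascript
// Number of acknowledging nodes that "majority" implies
// (simplified: assumes all members are data-bearing and voting).
function majorityOf(members) {
  return Math.floor(members / 2) + 1
}

// Rough check of whether a write concern value can ever be satisfied
// by a replica set of the given size.
function isSatisfiable(w, members) {
  if (w === "majority") return members >= 1
  return Number.isInteger(w) && w >= 0 && w <= members
}

const threeNodeMajority = majorityOf(3)
const tooMany = isSatisfiable(5, 3) // asks for more nodes than exist
```

For a 3-node set a majority is 2 nodes, and a numeric w larger than the member count cannot be met.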

m220js-chapter2-durablewrites

Basic Updates

The two update operations are updateOne() and updateMany(). Notice that updateOne() updates only the first document it finds that matches the filter.

m220js-chapter2-user-preferences

Joins

Joins combine data from different collections with the $lookup stage. Compass may be used to build an aggregation and then export it to the application's language (Node, Java, Python 3, C#).

$match {year: {"$gte": 1980, "$lt": 1990}}

$lookup {from: "comments", let: {"id": "$_id"}, pipeline: [{"$match": {"$expr": {"$eq": ["$movie_id", "$$id"]}}}, {"$count": "count"}], as: "movie_comments"}

The from field names the collection we are joining from. let declares variables for use in the pipeline; they refer to fields of the documents in the source collection. The result is a movie_comments array on each movie. Because the sub-pipeline ends with $count, that array holds the number of matching comments; without the $count stage it would hold the comments themselves.
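
An in-memory picture of what this $lookup produces may help. The sketch below is plain JavaScript, not driver code: attachCommentCounts is a hypothetical helper and the sample documents are made up. It mirrors the sub-pipeline that ends in $count, so each movie receives a movie_comments array with at most one count document.

```javascript
// For each movie, count the comments whose movie_id equals the movie's _id,
// the way the $expr/$eq match on "$$id" does in the sub-pipeline.
function attachCommentCounts(movies, comments) {
  return movies.map(movie => {
    const count = comments.filter(c => c.movie_id === movie._id).length
    // $count emits no document when nothing matched, so the array is empty.
    const movie_comments = count > 0 ? [{ count }] : []
    return { ...movie, movie_comments }
  })
}

// Made-up sample data
const movies = [{ _id: 1, title: "A" }, { _id: 2, title: "B" }]
const comments = [
  { movie_id: 1, text: "great" },
  { movie_id: 1, text: "again" },
]
const joined = attachCommentCounts(movies, comments)
```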

m220js-chapter2-getcomments

m220js-chapter2-create-update-comments

Basic Deletes

deleteOne() / deleteMany() are the delete counterparts of updateOne() / updateMany(). They change collection data, update indexes and add corresponding entries to the oplog, which drives replication inside the replica set. Remember that deleteOne() deletes the first matching document it finds in natural order (the order in which documents were inserted). deleteMany() deletes all documents that match the predicate.

m220js-chapter2-delete-comments

Chapter #3 Admin Backend

This chapter covers topics:

read concerns
join collections using expressive $lookup
perform bulk operations
clean data

Read concerns

A read concern sets how consistent and isolated the data returned by a read must be. It matters because a read can reach data that has not yet been replicated to all nodes in the replica set.

By default, MongoDB uses {readConcern: "local"}, which returns the most recent data on the queried node (normally the primary) with no guarantee that it has been replicated. There is a slim chance that such data is later rolled back if it never reaches the secondaries. In most cases this is fine. For mission-critical reads, a higher level of consistency is available with {readConcern: "majority"}, which only returns data that has been written to a majority of nodes and therefore will not be rolled back.

As an example, to count the comments per user and list the 20 most active commenters by email, the following pipeline may be used:

let group = {"$group" : {_id: "$email", count:{$sum: 1}}}
let sort = {$sort: {count: -1}}
let limit = {$limit: 20}
const pipeline = [group, sort, limit]

// TODO Ticket: User Report
// Use a more durable Read Concern here to make sure this data is not stale.
const readConcern = { level: "majority" }

// Shorthand property: the options object becomes { readConcern: { level: "majority" } }
const aggregateResult = await comments.aggregate(pipeline, {
  readConcern,
})

m220js-chapter3-user-reports

Bulk writes

Bulk writes batch a series of write operations into one container (a list or array, depending on the driver), send it in a single round trip and receive one acknowledgement, which saves network overhead.

An ordered bulk write is the default in MongoDB. It executes the writes sequentially and stops at the first failure. This default can be overridden with the flag {ordered: false}. An unordered bulk write lets the server execute the operations in parallel, so one write does not block the next:

db.someCollection.bulkWrite([{updateOne: {}}, {updateOne: {}}], {ordered: false})

Individual writes may still fail, but a failure does not stop the rest of the batch.
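
Building the operations array is usually a simple map over the pending changes. The sketch below assumes each change carries an _id and a new lastupdated value; buildUpdateOps and those field names are placeholders for illustration, not part of the course code.

```javascript
// Builds the operations array that bulkWrite() expects:
// one { updateOne: { filter, update } } entry per change.
function buildUpdateOps(changes) {
  return changes.map(({ _id, lastupdated }) => ({
    updateOne: {
      filter: { _id },
      update: { $set: { lastupdated } },
    },
  }))
}

const ops = buildUpdateOps([
  { _id: 1, lastupdated: "2019-05-15" },
  { _id: 2, lastupdated: "2019-05-16" },
])
const options = { ordered: false }
// With a real collection: await someCollection.bulkWrite(ops, options)
```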

m220js-chapter3-migration

Chapter #4 Resiliency

This chapter is devoted to application resiliency and robustness.

Connection Pooling

Connection pooling is about reusing database connections. Establishing a database connection takes time and resources, so when further requests are likely it is better to keep connections in a pool than to create and destroy one per request. After the initial connection, subsequent requests are served faster because they reuse existing connections. A large influx of operations can also be handled more quickly with a pool of existing connections. The course quotes a default pool size of 100; the actual default depends on the driver and its version.

m220js-chapter4-connection-pooling

Robust Client Configuration

It is better to always specify a wtimeout with majority writes. If an external resource such as the network has problems, waiting too long for {w: "majority"} acknowledgements turns into a bottleneck. A timeout guarantees that the application never waits longer than a preset value: {w: "majority", wtimeout: 5000} allows 5 seconds (wtimeout is in milliseconds) for the write to be acknowledged. Similarly, for connecting to a server, {serverSelectionTimeout: 5} may be set and the resulting error handled; the default is 30 seconds. wtimeout applies only to writes; it has no effect on read operations or on read concern majority.

m220js-chapter4-writetimeout

Writes with Error Handling

Errors are a fact of life. Distributed systems are prone to network errors, and concurrent systems are prone to duplicate key errors. A duplicate key error occurs when we try to insert a document whose _id already exists; it shows up as an E11000 duplicate key error. The best action may be to create a new key (_id) and retry the write. Timeout errors are best handled by increasing the timeout or, if possible, by lowering the durability guarantee (reducing the write concern). A write concern error occurs when we demand a durability that the cluster cannot satisfy: if the replica set has 3 nodes, {w: 5} cannot be satisfied and produces a write concern error. In all these cases, a try/catch block may be used to handle the error.
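
The "new key, then retry" advice for duplicate key errors can be sketched with try/catch. Everything here except the error code 11000 is an assumption for illustration: insertWithRetry, makeId and fakeInsert are hypothetical, and fakeInsert only imitates an insert that rejects an already used _id.

```javascript
const DUPLICATE_KEY = 11000 // code behind "E11000 duplicate key error"

// insertFn stands in for something like doc => collection.insertOne(doc);
// makeId is a hypothetical generator of fresh _id values.
async function insertWithRetry(insertFn, doc, makeId, maxAttempts = 3) {
  let candidate = doc
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await insertFn(candidate)
    } catch (err) {
      const retryable = err.code === DUPLICATE_KEY && attempt < maxAttempts
      if (!retryable) throw err
      candidate = { ...candidate, _id: makeId() } // new key, then retry
    }
  }
}

// Fake insert that treats _id 1 as already taken.
const taken = new Set([1])
async function fakeInsert(doc) {
  if (taken.has(doc._id)) {
    const err = new Error("E11000 duplicate key error")
    err.code = DUPLICATE_KEY
    throw err
  }
  taken.add(doc._id)
  return { insertedId: doc._id }
}

let nextId = 1
const outcome = insertWithRetry(fakeInsert, { _id: 1, name: "x" }, () => ++nextId)
```

Errors other than a duplicate key are rethrown unchanged, so timeouts and write concern errors still reach the caller's own try/catch.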

Principle of Least Privilege

Every program and every privileged user of system should operate using the least amount of privilege necessary to complete the job.

Some operational examples:

Assume an elections collection with documents like:

{ 
  year: 1828, 
  winner: "Andrew Jackson", 
  winner_running_mate: "John C. Calhoun", 
  winner_party: "Democratic", 
  winner_electoral_votes: 178, 
  total_electoral_votes: 261 
}

Here total_electoral_votes represents the total number of electoral votes that year, and winner_electoral_votes represents the number of electoral votes received by the winning candidate.

To retrieve all the Republican winners with at least 160 electoral votes we can use the following query;

elections.find( { winner_party: "Republican", winner_electoral_votes: { "$gte": 160 } } )

Assume a phone firmware collection with documents like:

{
  model: 5,
  date_issued : Date("2014-03-04T14:23:43.123Z"),
  software_version: 3.7,
  needs_to_update: true
}

Write an update query that sets the needs_to_update field to true for documents whose software version is older than 4.0:

phones.updateMany( { software_version: { "$lt": 4.0 } },
                       { "$set": { needs_to_update: true } } )

For a collection like:

{
  name: "Ada",
  height: 1.7
}

Write a query that finds only the 4th- and 5th-tallest people in the people_heights collection:

people_heights.find().sort({ height: -1 }).skip(3).limit(2)

Study notes for M001 MongoDB Basics

24 Wednesday Apr 2019

Posted by Sifa Serdar Ozen in MongoDB

Tags

M001, MongoDB Basics

MongoDB is a document database whose popularity keeps growing. M001 is its introductory MOOC, available for free from MongoDB University.

m001-mongodb-course-overview

The course lasts three weeks with assignments and a final project.

The course uses sample databases deployed in Atlas (MongoDB's database-as-a-service platform). To access Atlas through a graphical interface, Compass (MongoDB's GUI client) is needed.

Get Compass from the MongoDB download center and install it.

m001-download-mongodb-compass

Compass is a visual interface to a MongoDB back end. It makes database configuration and data exploration easy.

The following example shows an installation on Ubuntu / Debian.

m001-Install-mongodb-from-command-line

In order to display geospatial data (needed in the hands-on exercises), a third-party service must be allowed in the privacy settings.

m001-enable-geospatial-visualisation

The course-provided sample database connection is configured as follows:

m001-new-connection-addition-in-compass

The overview shows which databases are present on the selected server, together with their collections and sizes. Collections are logical groups of documents in a database. Indexes exist mostly for performance optimization.

m001-connection-to-m001-db

In MongoDB, the combination of a database name and a collection name defines a namespace. Documents are the lowest-level entities; they are presented in extended JSON format and make up collections.

JSON spec	MongoDB extensions
String	Date
Array	Geospatial data
Object	special keys like $gte, $lt
Boolean
Floating point
Decimal
Null

The JSON spec may be consulted for the complete syntax. Notice that newlines and whitespace outside quotes are not significant in JSON.

m001-field-values-in-table-view

Schema analysis gives quick insight into all fields present in the collection. It is easy to explore the list of fields, their data types and a summary of a sample range of values.

m001-videos-movies-database-scema

Compass also has a quick query field for focused analysis. As an example, the query

{"tripduration": {"$gte": 60, "$lt": 65}}

returns documents whose tripduration field is greater than or equal to 60 and less than 65.

m001-citibike-trips-filter-to-tripduration

Similarly, for the citibike.trips namespace, the query

{"birth year": {"$gte": 1985, "$lte": 1990}}

returns documents with a birth year between 1985 and 1990.

m001-using-filters-in-mongodb

For the video.movies namespace, the query

{"director": "Patty Jenkins"}

returns documents about movies directed by Patty Jenkins.

m001-video-movie-collection-filtering-due-to-director

The MongoDB query language supports CRUD operations

Create / Read / Update / Delete

As Compass does not provide all functionality, the mongo shell can be used for all CRUD operations in a text-based environment. The shell comes with the MongoDB server. An Atlas cluster requires SSL support, which is available in the latest MongoDB Enterprise server edition.

Here the mongo shell is installed through the package manager. As the default shell from mongo-clients is old, we need to get it from the official MongoDB repository.

Add repository

$ echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/4.0 multiverse" \
    | sudo tee /etc/apt/sources.list.d/mongodb-org-4.0.list

Add the public key

$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 \
    --recv 9DA31620334BD75D9DCB49F368818C72E52529D4

update

$ sudo apt-get update

and install

$ sudo apt-get install -y mongodb-org-shell

As a result, we will have a recent mongo shell to connect to Atlas.

When connecting to a cluster through the mongo shell, we may provide all host names, so that when the primary goes down we can still connect to the others.

Creating a sandbox cluster

A sandbox cluster can be used for proofs of concept; register through https://cloud.mongodb.com/links/registerForAtlas. The M0 instance size is free.

After creating the cluster, we can access it from the command line. The connection parameters are provided through

[HERE COMES ATLAS COPY LINK]

The shell provides basic operations such as:

sifa@sifa:~$ mongo "mongodb+srv://sandbox-zn1ll.mongodb.net/test" --username m001-student

MongoDB Enterprise Sandbox-shard-0:PRIMARY> rs.slaveOk()
MongoDB Enterprise Sandbox-shard-0:PRIMARY> show dbs
admin 0.000GB
local 3.205GB
MongoDB Enterprise Sandbox-shard-0:PRIMARY>

We can load databases into the Atlas cluster through the mongo shell:

MongoDB Enterprise Sandbox-shard-0:PRIMARY> load("loadMovieDetailsDataset.js")

We can use Mongo shell to insert one item

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.moviesScratch.insertOne({title: "Tosun Pasa", year: 1977});
{
"acknowledged" : true,
"insertedId" : ObjectId("5ccb319437724a809678ebd9")
}
MongoDB Enterprise Sandbox-shard-0:PRIMARY>

We can also insert many items by specifying an array of objects:

MongoDB Enterprise Sandbox-shard-0:PRIMARY> 
MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.moviesScratch.insertMany([{title: "Tosun Pasa 2", year: 2017}, {title: "saban oglu saban", year: 1988}]);
{
"acknowledged" : true,
"insertedIds" : [
ObjectId("5ccb32ae37724a809678ebda"),
ObjectId("5ccb32ae37724a809678ebdb")
]
}
MongoDB Enterprise Sandbox-shard-0:PRIMARY>

Within a collection, _id values must be unique; otherwise we get a duplicate key error. By default insertMany() stops at the first error. To continue past it, specify an unordered insert, which inserts all non-duplicate documents:

db.moviesScratch.insertMany([{...}, {...}], {ordered: false});

We can query in the mongo shell as follows:

MongoDB Enterprise Sandbox-shard-0:PRIMARY> 
MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.moviesScratch.find({year: 1988})
{ "_id" : ObjectId("5ccb32ae37724a809678ebdb"), "title" : "saban oglu saban", "year" : 1988 }

We can also use dot notation to query nested documents. In the shell, keys that use dot notation must be quoted.

db.somecollection.find({"wind.direction.angle": 290})

We can perform count() operation of a query to get total number of documents satisfying the condition:

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({"awards.wins" : 2, "awards.nominations": 2}).count()
12
MongoDB Enterprise Sandbox-shard-0:PRIMARY> 
MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({"rated": "PG", "awards.nominations": 10}).count()
3
MongoDB Enterprise Sandbox-shard-0:PRIMARY>

Query on Arrays

An exact match on a whole array (same elements in the same order) may be done by:

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({cast: ["Saban", "Saban oglu saban"]})

A single element of an array at a specific position

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({"cast.0": "Saban oglu saban"})

For example, to count the movies in the movieDetails collection that have "Western" as the second genre:

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({"genres.1": "Western"}).count()
14

A single element of an array without any position preference

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({cast: "Saban oglu saban"})

For example, to count the movies in the movieDetails collection that have "Family" anywhere in the genres array:

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({"genres": "Family"}).count()
124

Projections

By default MongoDB returns all fields of the documents that match the query. We can use a projection to limit the fields in the resulting documents: use 0 to exclude a field and 1 to include it. For example, to get only the title field (plus _id, which is included by default):

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({"genres.1": "Western"}, {"title": 1})
{ "_id" : ObjectId("5ccb2bfc37724a809678e2e0"), "title" : "A Million Ways to Die in the West" }
{ "_id" : ObjectId("5ccb2bfc37724a809678e2e1"), "title" : "Wild Wild West" }
{ "_id" : ObjectId("5ccb2bfc37724a809678e346"), "title" : "The Hallelujah Trail" }
{ "_id" : ObjectId("5ccb2bfc37724a809678e348"), "title" : "The Big Trail" }
{ "_id" : ObjectId("5ccb2bfc37724a809678e34a"), "title" : "Crossfire Trail" }
{ "_id" : ObjectId("5ccb2bfc37724a809678e34e"), "title" : "Sukiyaki Western Django" }
{ "_id" : ObjectId("5ccb2bfc37724a809678e353"), "title" : "Western Union" }
{ "_id" : ObjectId("5ccb2bfc37724a809678e35c"), "title" : "Carry on Cowboy" }
{ "_id" : ObjectId("5ccb2bfc37724a809678e35e"), "title" : "Ride 'Em Cowboy" }
{ "_id" : ObjectId("5ccb2bfc37724a809678e4f6"), "title" : "Casa de mi Padre" }
{ "_id" : ObjectId("5ccb2bfc37724a809678e67b"), "title" : "Life Is Tough, Eh Providence?" }
{ "_id" : ObjectId("5ccb2bfc37724a809678e914"), "title" : "Gunfight at the O.K. Corral" }
{ "_id" : ObjectId("5ccb2bfc37724a809678ea76"), "title" : "Ci Xi mi mi sheng huo" }
{ "_id" : ObjectId("5ccb2bfc37724a809678eb36"), "title" : "El Topo" }
MongoDB Enterprise Sandbox-shard-0:PRIMARY>

Updating a single element

updateOne() updates a single document. We can filter on _id to be sure that only one document matches, or filter on other keys, in which case MongoDB updates the first matching document it finds.

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.updateOne({"title": "The Martian"}, {$set: {poster: "some-nice-poster-web-address"}})
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
MongoDB Enterprise Sandbox-shard-0:PRIMARY>

A list of the update operators is available at https://docs.mongodb.com/manual/reference/operator/update-field/

Updating multiple elements simultaneously

updateMany() updates multiple documents. Some documents have fields set to null. To remove these null fields, we can use:

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.updateMany({rated: null}, {$unset: {rated: ""}})
{ "acknowledged" : true, "matchedCount" : 1599, "modifiedCount" : 1599 }
MongoDB Enterprise Sandbox-shard-0:PRIMARY>

Upsert

We do not need to query first and then insert; this can be done in one combined operation:

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.updateOne({"imdb.id": SomeObj.imdb.detail}, {$set: SomeObj}, {upsert: true});

Replace

replaceOne() applies changes to only one document, the first one found that matches the filter expression. replaceOne() replaces the whole document, whereas updateOne() can make field-wise changes.

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.replaceOne({"imdb.id": SomeObj.imdb.detail}, SomeObj);

deleteOne() / deleteMany()

Similar to other One() / Many() operations.

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.reviews.deleteMany({reviewer_id: 756826})
{ "acknowledged" : true, "deletedCount" : 5 }
MongoDB Enterprise Sandbox-shard-0:PRIMARY> 
MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.reviews.deleteOne({title: "The Martian"})

MongoDB Comparison Operators

$eq, $gt, $gte, $lt, $lte, $ne, $in

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({runtime: {$gt: 240}}, {_id: 0, title: 1, runtime: 1})
{ "title" : "AC/DC: Plug Me In", "runtime" : 420 }
{ "title" : "Heremias: Unang aklat - Ang alamat ng prinsesang bayawak", "runtime" : 540 }
{ "title" : "Tie Xi Qu: West of the Tracks", "runtime" : 551 }
{ "title" : "Tie Xi Qu: West of the Tracks", "runtime" : 551 }
{ "title" : "Tie Xi Qu: West of the Tracks", "runtime" : 551 }
MongoDB Enterprise Sandbox-shard-0:PRIMARY>

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({runtime: {$eq: 90}}, {_id: 0, title: 1, runtime: 1})
{ "title" : "Best in Show", "runtime" : 90 }
{ "title" : "Alien Outpost", "runtime" : 90 }
{ "title" : "Bill & Ted's Excellent Adventure", "runtime" : 90 }
{ "title" : "Trouble Bound", "runtime" : 90 }
{ "title" : "Little Nicky", "runtime" : 90 }
{ "title" : "Il Bi e il Ba", "runtime" : 90 }
{ "title" : "Fink fährt ab", "runtime" : 90 }

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({runtime: {$gt: 200, $lte: 220}}, {_id: 0, title: 1, runtime: 1})
{ "title" : "Ek Aur Ek Gyarah: By Hook or by Crook", "runtime" : 205 }
{ "title" : "The Godfather: Part II", "runtime" : 202 }
{ "title" : "Attack on Terror: The FBI vs. the Ku Klux Klan", "runtime" : 215 }
{ "title" : "TS Playground 5", "runtime" : 206 }
{ "title" : "The Decline of the Century: Testament L.Z.", "runtime" : 201 }
MongoDB Enterprise Sandbox-shard-0:PRIMARY>

Remember that $ne also matches documents in which the field is missing.

$in has a special syntax: its value must be an array, e.g. {$in: ["G", "PG"]}

We can use $in to count the movies whose writers include "Ethan Coen" or "Joel Coen":

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({writers: {$in: ["Ethan Coen", "Joel Coen"]}}).count()
3

Presence or existence operators

$exists, $type

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({awards: {$exists: true}}).count()
2294
MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({awards: {$exists: false}}).count()
1

To match documents where the field is either null or missing altogether:

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({"tomato.consensus": null}).pretty()

The $type filter returns only the documents in which the specified field has the given type:

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({"tomato.consensus": {$type: "string"}}).count()
304

Logical Operators

$or, $and

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({$or: [{"tomato.meter": {$gt: 95}}, {"metacritic": {$gt: 88}}]}, {_id: 0, title: 1, "tomato.meter": 1, "metacritic": 1})
{ "title" : "Once Upon a Time in the West", "tomato" : { "meter" : 98 }, "metacritic" : 80 }
{ "title" : "Star Wars: Episode IV - A New Hope", "tomato" : { "meter" : 94 }, "metacritic" : 92 }
{ "title" : "Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb", "tomato" : { "meter" : 99 }, "metacritic" : 96 }
{ "title" : "2001: A Space Odyssey", "tomato" : { "meter" : 96 }, "metacritic" : 86 }
{ "title" : "The Adventures of Robin Hood", "tomato" : { "meter" : 100 }, "metacritic" : 97 }
{ "title" : "The Truman Show", "tomato" : { "meter" : 94 }, "metacritic" : 90 }
{ "title" : "Quiz Show", "tomato" : { "meter" : 96 }, "metacritic" : 88 }
{ "title" : "Evil Dead II", "tomato" : { "meter" : 98 }, "metacritic" : 69 }

Array queries

Query multiple elements in an array field

$all: [“Comedy”, “Crime”, “Drama”] all the elements ( in any order ) must occur

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({genres: {$all: ["Comedy", "Crime", "Drama"]}}).count()
8

Query size of array field

$size filters with the length of the array. For example to filter the movies with just 1 country in the array use;

MongoDB Enterprise Sandbox-shard-0:PRIMARY> db.movieDetails.find({countries: {$size: 1}}).count()
1915
MongoDB Enterprise Sandbox-shard-0:PRIMARY>

$elemMatch requires that all query conditions match a single element (embedded document) of an array:

db.movieDetails.find({boxOffice: {$elemMatch: {"country" : "Germany", "revenue": {$gt: 16}}}})
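
The difference between $elemMatch and separate dotted conditions can be shown on a plain array. This is an illustrative sketch in JavaScript, not shell code: both helper functions and the sample document are made up.

```javascript
// $elemMatch semantics: ONE array element must satisfy every condition.
function elemMatchGermany(doc, minRevenue) {
  return (doc.boxOffice || []).some(
    entry => entry.country === "Germany" && entry.revenue > minRevenue,
  )
}

// Without $elemMatch, each condition may be met by a different element.
function separateConditions(doc, minRevenue) {
  const entries = doc.boxOffice || []
  return (
    entries.some(e => e.country === "Germany") &&
    entries.some(e => e.revenue > minRevenue)
  )
}

// Made-up sample document
const sampleMovie = {
  boxOffice: [
    { country: "Germany", revenue: 10 },
    { country: "USA", revenue: 200 },
  ],
}
const strict = elemMatchGermany(sampleMovie, 16)
const loose = separateConditions(sampleMovie, 16)
```

Here strict is false because no single entry is both German and above 16, while loose is true because the two conditions are met by different entries.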

MongoDB Enterprise Sandbox-shard-0:PRIMARY> martian = db.movieDetails.findOne({"title" : "The Martian"})

To match documents where a field exists and is null, use:

{$and: [{tripduration: {$exists: true}}, {tripduration: null}]}
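
The two null-related filters behave differently for missing fields, which a small plain-JavaScript model makes visible. The helpers are hypothetical and cover top-level fields only.

```javascript
// { field: null } matches documents where the field is null OR missing.
function matchesNull(doc, field) {
  return !(field in doc) || doc[field] === null
}

// Adding { $exists: true } narrows that to "present and null".
function existsAndIsNull(doc, field) {
  return field in doc && doc[field] === null
}

const missing = {}
const explicitNull = { tripduration: null }
const withValue = { tripduration: 62 }
```

matchesNull accepts both missing and explicitNull, while existsAndIsNull accepts only explicitNull; neither accepts withValue.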

M201 MongoDB Performance study notes day I

23 Thursday Feb 2017

Posted by Sifa Serdar Ozen in MongoDB

Tags

M201 Performance, MongoDB

This will serve as a small memento of M201 MongoDB Performance.

Lesson highlights for day I

As memory operations are much faster than I/O operations, MongoDB depends heavily on memory, especially for:

aggregation
index traversing
writes (first performed in memory)
query engine
connections (about 1 MB per connection)

CPU power is needed for:

storage engine (WiredTiger)
concurrency model (by default all CPU cores are used)
page compression
data calculation
aggregation framework
map reduce

The recommended RAID architecture for MongoDB is RAID 10.

Applications connect to mongos, which connects to the config servers and shards.

Applications should choose wisely:

read concern
write concern
read preference

Lab setup

Here I will set up my mongod instance with Vagrant and Puppet. My sample configuration is:

vagrantfile (Vagrantfile)

Vagrant.configure("2") do |config|
  # The most common configuration options are documented and commented below.
  # For a complete reference, please see the online documentation at
  # https://docs.vagrantup.com.

  # Every Vagrant development environment requires a box.
  config.vm.box = "debian81"

  # Create a private network, which allows host-only access to the machine
  # using a specific IP.
  config.vm.hostname = "mongodb"
  config.vm.network :private_network, ip: "192.168.10.200"

  # Provider-specific configuration so you can fine-tune various
  # backing providers for Vagrant.
  config.vm.provider "virtualbox" do |vb|
    vb.memory = 2048
    vb.cpus = 1
  end

  # Enable provisioning with a shell script. Additional provisioners such as
  # Puppet, Chef, Ansible, Salt, and Docker are also available.
  config.vm.provision :puppet do |puppet|
    puppet.module_path = "puppet/modules"
    puppet.manifests_path = "puppet/manifests"
    puppet.options = ['--verbose']
  end

  config.ssh.private_key_path = ['~/.vagrant.d/insecure_private_key', '~/.ssh/id_rsa', '.vagrant\machines\default\virtualbox\private_key']
  config.ssh.forward_agent = true
end

puppet manifest (puppet\manifests\default.pp)

# set path for executables
Exec { path => [ "/bin/", "/sbin/" , "/usr/bin/", "/usr/sbin/" ] }

# list packages that should be installed
$system_packages = ['vim', 'git', 'gpp', 'make',]

# perform an apt-get update
exec { 'update':
 command => 'apt-get update',
 require => Exec['mongodb_source_add']
}

# install system packages after an update
package { $system_packages:
 ensure => "installed",
 require => Exec['update']
}

# Import the public key used by the package management system
# sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 0C49F3730359A14518585931BC711F9BA15703C6
exec { 'mongodb_key_get':
 command => 'apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 0C49F3730359A14518585931BC711F9BA15703C6'
}

# Create a /etc/apt/sources.list.d/mongodb-enterprise.list file for MongoDB.
# echo "deb http://repo.mongodb.com/apt/debian jessie/mongodb-enterprise/3.4 main" | sudo tee /etc/apt/sources.list.d/mongodb-enterprise.list
exec { 'mongodb_source_add':
 command => 'echo "deb http://repo.mongodb.com/apt/debian jessie/mongodb-enterprise/3.4 main" | sudo tee /etc/apt/sources.list.d/mongodb-enterprise.list',
 require => Exec['mongodb_key_get']
}

# Install the MongoDB Enterprise packages
package { 'mongodb-enterprise':
 ensure => "installed",
 install_options => ['-y'],
 require => Exec['update']
}

Lab for day I

The lab requires performing a simple query on an imported JSON database. I followed these steps:

get people.json with wget

wget https://university.mongodb.com/static/MongoDB_2017_M201_February/handouts/people.a74d7de502b1.json

start mongod

Start the mongod instance, and check the contents of the log file with tail

sudo service mongod start

Then we may perform queries on the databases as we wish.
