Basic Couchbase querying for SQL people

This is the second of a series of posts about Couchbase. For the first one, click here.

Probably, you all know how to query a relational database. It is done using SQL, a declarative (non-imperative) language that is widely used to insert and retrieve information from databases.

Taking a users table A typical SQL query should look like this:


SELECT u.id, u.username, u.first_name, u.last_name, u.last_login
FROM users u
WHERE u.id = “1111”;


But... How do I create a query similar to the one above in Couchbase? Well. The answer is: Creating a view

CREATING A VIEW
Download and install Couchbase server on your computer, and go through the set-up process. Then, you’ll need to create some documents. You can download these sample documents and use them in the following examples.

Once you have created your documents, go to the VIEWS tab.




Once there, you’ll see something like this page.



As you have probably never created a view on your system, your views list will be empty. Let’s start by creating the easiest query ever. Click on “Create Development View” and use these names to create the view:

Design Document Name: _design/dev_users
View Name: by_id

And click on Save.




Now, your design document and your view have been created. Right now, it would be easier for you to think in a design document as in a package of views. And to think in a view as in a query. At least, it was easier for me. You should be seeing a screen like this:





Click on the name of the view. You can create your query in the next page.

Basic Querying - SELECT

There are two text inputs where you can type the code of your view. The one on the left is for the MAP function. The one on the right is for the REDUCE function. You can read a lot more on how this map and reduce things work on the Couchbase Server Manual





But, basically, you’ll need the map function to return data from your database, and the reduce function for calculating and aggregating the data (like in SQL SELECT COUNT(*), for instance). In Couchbase, both of them are created using JavaScript, so you don’t need to learn a new tricky language. Let’s focus on retrieve the information, by now.

Leave the textbox on the left with the sample code, like this:

function (doc) {
emit(doc._id, null);
}


Map functions are used to create an index which is queried every time you ask Couchbase for some information. The “emit” call stands for “add this key/value pair to the index”. With the basic code, we are adding a pair key=doc._id and value=null to the index, which will work for a lot of queries. Save the view and click on Show Results twice. You will see the results of your view like this:




This method is not the most comfortable one to see the results of a view. It works for a few results, but not for big sets of data. I encourage you to open a new browser tab and navigate to this URL:

http://127.0.0.1:8092/default/_design/dev_users/_view/by_id?connection_timeout=60000&limit=10&skip=0


(Notice the words in bold. These are the names you specified a minute ago)

So you’ll get this:
{"total_rows":3,"rows":
[
{"id":"1111","key":"1111","value":null},
{"id":"2222","key":"2222","value":null},
{"id":"3333","key":"3333","value":null}
]
}

So... What has happened? By emit-ing the id of the document, we have SELECTED the documents that the view will retrieve. For each row, you can see an attribute named “key”, which is the first argument of the emit function above and an attribute named “value”, that has a null value and that is the second argument of the same emit call.

In SQL, you have done this:

SELECT u.id
FROM users u


Going back to the SQL query at the beginning of this post, we wanted to retrieve multiple information. We can do this by changing the emit call and add some value to the second argument. Edit the view and type this code:


function (doc) {
emit(doc._id, [doc.username, doc.first_name, doc.last_name, doc.last_login]);
}


If you reload the previous URL, you’ll see a new results, like those:
{"total_rows":3,"rows":
[
{"id":"1111","key":"1111","value":["alice","Alice","Abrahams","2012-08-31 01:06:35"]},
{"id":"2222","key":"2222","value":["bob","Bob","Bacon","2012-08-30 02:06:35"]},
{"id":"3333","key":"3333","value":["charlie","Charles","Celsius","2012-08-29 02:06:35"]}
]
}

Which is exactly what you wanted to retrieve by your first SQL query. By emitting the values that you want to retrieve, you’ll be doing the equivalent to specifying the names of the attributes in the SELECT clause in SQL.

 

ADDING A WHERE CLAUSE

Probably the most common operation in SQL is selecting a row by its ID. This can be easily achieved in Couchbase, and you don’t even have to modify your view code. Simply go to this URL

http://127.0.0.1:8092/default/_design/dev_users/_view/by_id?connection_timeout=60000&limit=10&skip=0&key=”1111”


This is what you’ll get:
{"total_rows":3,"rows":
[
{"id":"1111","key":"1111","value":["alice","Alice","Abrahams","2012-08-31 01:06:35"]}
]
}

By adding the key=some_value argument to the URL, you’ll get only the results matching that key. By the way: The emitted keys can have repeated values. This is the way to retrieve multiple results at the same time.

Imagine you want to get all the users from the same city. Then, you’ll create a view named “users / by_city”, with the following map function:


function (doc) {
emit(doc.city, [doc.username, doc.first_name, doc.last_name, doc.last_login]);
}


Accesing it from http://127.0.0.1:8092/default/_design/dev_users/_view/by_city?connection_timeout=60000&limit=10&skip=0 will return:
{"total_rows":3,"rows":
[
{"id":"1111","key":"Barcelona","value":"alice","Alice","Abrahams","2012-08-31 01:06:35"]},
{"id":"2222","key":"Barcelona","value":["bob","Bob","Bacon","2012-08-30 02:06:35"]},
{"id":"3333","key":"Boston","value":["charlie","Charles","Celsius","2012-08-29 02:06:35"]}
]
}

If you now specify a key in the URL, like this:

http://127.0.0.1:8092/default/_design/dev_users/_view/by_city?connection_timeout=60000&limit=10&skip=0&key=”Barcelona”

You’ll get only these two results:
{"total_rows":3,"rows":
[
{"id":"1111","key":"Barcelona","value":["alice","Alice","Abrahams","2012-08-31 01:06:35"]},
{"id":"2222","key":"Barcelona","value":["bob","Bob","Bacon","2012-08-30 02:06:35"]}
]
}

SELECT COUNT
In other scenarios, you’ll need to SELECT COUNT(*). You can do this by using the REDUCE function. Couchbase includes a set of the most common reduce functions.

Go back and edit the “users / by_city” view that you created. On the textbox on the right, type _sum. The view should look like this:


Then, open http://127.0.0.1:8092/default/_design/dev_users/_view/by_city?connection_timeout=60000&limit=10&skip=0&group=true&group_level=1
{"rows":
[
{"key":"Barcelona","value":2},
{"key":"Boston","value":1}
]
}

There you go. You have performed the equivalent to SQL:


SELECT COUNT(*)
FROM users
GROUP BY users.city


If you need to retrieve the amount of users living in a single city, just add the “key=” argument to the previous URL, like this:

http://127.0.0.1:8092/default/_design/dev_users/_view/by_city?connection_timeout=60000&limit=10&skip=0&group=true&group_level=1&key=”Barcelona”

Then, you’ll get:
{"rows":
[
{"key":"Barcelona","value":2}
]
}

Which is the equivalent to:

SELECT COUNT(*)
FROM users
WHERE users.city=”Barcelona”


 

ORDER BY
Every view will return an ordered set of rows, based on the lexicographical order of the emitted key, according to unicode. It’s roughly like "ORDER BY key ASC" in SQL. If you want your results in the reserve order, just add a param “descending” to the view URL. For instance, this call:

http://127.0.0.1:8092/default/_design/dev_users/_view/by_id?connection_timeout=60000&limit=10&skip=0&descending=true

will return:
{"total_rows":3,"rows":
[
{"id":"3333","key":"3333","value":["charlie","Charles","Celsius","2012-08-29 02:06:35"]},
{"id":"2222","key":"2222","value":["bob","Bob","Bacon","2012-08-30 02:06:35"]},
{"id":"1111","key":"1111","value":["alice","Alice","Abrahams","2012-08-31 01:06:35"]}
]
}

 

FROM CLAUSE

Couchbase doesn’t have tables, so there is no such a thing as “SELECT * FROM …”. If you need that kind of partitions in your database (you probably will, by the way), you need to add an additional attribute to the documents. In the sample documents, this is called “jsonType” and simulates two different tables: active_users and inactive_users.

In this case, a query like:

SELECT u.id, u.username, u.first_name, u.last_name, u.last_login
FROM active_users u


Will become to a view having this map function:

function (doc) {
if(doc.jsonType == "active_user"){
emit(doc._id, [doc.username, doc.first_name, doc.last_name, doc.last_login]);
}
}


which will return this info:
{"total_rows":2,"rows":
[
{"id":"1111","key":"1111","value":["alice","Alice","Abrahams","2012-08-31 01:06:35"]},
{"id":"2222","key":"2222","value":["bob","Bob","Bacon","2012-08-30 02:06:35"]}
]
}

As you see, performing basic SELECT operations in Couchbase is simple, very simple. In a next post, I'll talk about more advanced topics such as selecting multiple keys at a time, joins, pagination and selecting between two given values.




Popular posts from this blog

Diseño y comités

Revisiting charts: NodeJS, Twitter and Elasticsearch

Creating an e-commerce platform with Couchbase 2.0