NoSql Interview Questions and Answers (44) - Page 2

List some benefits of Impala

1) One of the key ones is low latency for executing SQL queries on top of Hadoop. Part of this comes from bypassing the MapReduce infrastructure, which involves significant overhead, especially when starting and stopping JVMs.

2) Cloudera also claims an improvement of several orders of magnitude in performance compared to executing the same SQL queries using Hive.

3) Another benefit is that if we want to look under the hood at what Cloudera has provided in Impala, or tinker with the code, the source code is available to access and download.
Drawbacks of Impala

1) Impala isn't a GA (generally available) offering yet. As a beta offering, it has several limitations in functionality and capability; for example, several data sources and file formats aren't yet supported.

2) Also, ODBC is currently the only client driver available, so if we have JDBC applications we cannot use them directly yet.

3) Another Impala drawback is that it's only available for use with Cloudera's distribution of Hadoop, that is, CDH 4.1.
What are the various categories of NoSQL?

NOTE: This is objective type question, Please click question title for correct answer.
What is Key-Value Store Database?

This kind of database stores data in a hash table, where each entry has a unique key and a pointer to a particular item of data, e.g.

Key             : Value
--------------- : --------------
Id1_Name        : Niladri Biswas
Id1_Citizenship : Indian

Id2_Name        : Mike Curz
Id2_Citizenship : American


Since each object is guaranteed to have a unique key, we can query the database for that key and get the result back from whichever node holds the object.

Examples include Riak, Dynamo etc.
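
The lookup described above can be sketched in a few lines of Python. This is only a toy illustration, not how Riak or Dynamo are implemented: the class name TinyKeyValueStore, the node count and the use of an MD5 hash to pick a node are all assumptions made for the example.

```python
import hashlib


class TinyKeyValueStore:
    """Toy key-value store that spreads keys across a fixed number of nodes."""

    def __init__(self, node_count=3):
        # Each "node" is just an in-memory dict in this sketch.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # Hash the key so that the same key always maps to the same node.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)


store = TinyKeyValueStore()
store.put("Id1_Name", "Niladri Biswas")
store.put("Id1_Citizenship", "Indian")
store.put("Id2_Name", "Mike Curz")
store.put("Id2_Citizenship", "American")

print(store.get("Id1_Name"))         # Niladri Biswas
print(store.get("Id2_Citizenship"))  # American
```

Because the node is derived from the key alone, a read never has to search every node; it goes straight to the one that holds the object.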
What is Column Family Database?

A Column Family database stores data column-wise rather than row-wise. In a Column Store Database, each row is addressed by a key and contains one or more "columns". Columns are themselves key-value pairs. The column names need not be predefined, i.e. the structure isn't fixed. Columns in a row are stored in sorted order according to their keys (names).

7DE04F9F-5398-4DC5-9BC4-66CC96A927DE <- Row
Name     Location  Salary            <- col col col
A.Reddy  India     2000              <- val val val

CC29FC6D-BA52-4D98-A366-C7F666941C46 <- Row
Name     Location  Salary            <- col col col
A.Cruz   Burma     5000              <- val val val

Examples : Cassandra, HBase etc.
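
As a rough sketch of the row and column layout above, the following Python models a column family as a dictionary of rows, each holding its own set of named columns. The ColumnFamily class is hypothetical and is not the Cassandra or HBase API; the "Department" column is an invented extra, added only to show that rows need not share the same columns.

```python
class ColumnFamily:
    """Toy column family: row key -> {column name: value}."""

    def __init__(self):
        self.rows = {}

    def insert(self, row_key, column, value):
        # Columns are created on demand; no schema is declared up front.
        self.rows.setdefault(row_key, {})[column] = value

    def get_row(self, row_key):
        # Return the row's columns sorted by column name.
        return sorted(self.rows.get(row_key, {}).items())


employees = ColumnFamily()

row1 = "7DE04F9F-5398-4DC5-9BC4-66CC96A927DE"
employees.insert(row1, "Name", "A.Reddy")
employees.insert(row1, "Location", "India")
employees.insert(row1, "Salary", 2000)

row2 = "CC29FC6D-BA52-4D98-A366-C7F666941C46"
employees.insert(row2, "Name", "A.Cruz")
employees.insert(row2, "Location", "Burma")
employees.insert(row2, "Salary", 5000)
employees.insert(row2, "Department", "Sales")  # hypothetical extra column

print(employees.get_row(row1))
print(employees.get_row(row2))
```

The second row carries one more column than the first, and both come back ordered by column name.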

What is Document Store Database?

A Document database stores semi-structured or unstructured records as documents. The information is encoded using formats such as XML, YAML, JSON or BSON, or binary forms such as PDF and Microsoft Office documents (MS Word, Excel, and so on).

The following is a simple document that we can store in a Document Store Database

{
  "Question Name" : "What is MongoDB",
  "Author" : "Niladri Biswas",
  "Description" : "In this article we will learn about the installation of MongoDB on a 64 bit windows machine."
}


Examples : MongoDB, CouchDB etc.
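
To make the idea concrete, here is a minimal Python sketch that keeps a few JSON documents in a list and looks them up by field value. It uses only the standard json module and is not the MongoDB or CouchDB API; the find helper and the "Tags" field are assumptions added for illustration.

```python
import json

# A "collection" is just a list of documents; each document may have different fields.
collection = [
    {
        "Question Name": "What is MongoDB",
        "Author": "Niladri Biswas",
        "Description": "In this article we will learn about the installation "
                       "of MongoDB on a 64 bit windows machine.",
    },
    {
        "Question Name": "How to install RavenDB",
        "Author": "Niladri Biswas",
        "Tags": ["RavenDB", "installation"],  # hypothetical field, absent above
    },
]


def find(documents, criteria):
    """Return every document whose fields match all key/value pairs in criteria."""
    return [
        doc for doc in documents
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Documents survive a round trip through their JSON text form.
restored = json.loads(json.dumps(collection))

matches = find(restored, {"Question Name": "What is MongoDB"})
print(matches[0]["Author"])
print(len(find(restored, {"Author": "Niladri Biswas"})))
```

Note that the two documents do not share the same set of fields, which is what "semi-structured" means in practice.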
What is Graph Database?

This kind of NoSQL database fits best where, starting from a given node, we need to find all the connected nodes whose edges satisfy a given predicate. A classic example is any social networking site.

Examples : Neo4j etc.
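
The kind of query described above can be sketched as a breadth-first traversal that only follows edges matching a predicate. This is plain Python rather than Neo4j or its query language, and the people, the edge labels and the reachable function are all invented for the example.

```python
from collections import deque

# Adjacency list: node -> list of (edge label, neighbouring node). Hypothetical data.
graph = {
    "Alice": [("FRIEND", "Bob"), ("FOLLOWS", "Dave")],
    "Bob": [("FRIEND", "Carol")],
    "Carol": [],
    "Dave": [("FRIEND", "Eve")],
}


def reachable(graph, start, edge_ok):
    """Return all nodes reachable from start using only edges accepted by edge_ok."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for label, neighbour in graph.get(node, []):
            if edge_ok(label) and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)
    return seen


friends = reachable(graph, "Alice", lambda label: label == "FRIEND")
print(sorted(friends))  # ['Bob', 'Carol']
```

Dave and Eve are left out because the only path to them starts with a FOLLOWS edge, which the predicate rejects.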
Which of the following is a Document Store Database?

NOTE: This is objective type question, Please click question title for correct answer.
Which command needs to be executed to check if MongoDB is running properly?

NOTE: This is objective type question, Please click question title for correct answer.
What is RavenDB?

RavenDB is a Document Store Database for .NET that stores semi-structured or unstructured records/documents in JSON format.

The following is a simple document that we can store in a Document Store Raven Database

{
  "Question Name" : "How to install RavenDB",
  "Author" : "Niladri Biswas",
  "Description" : "This article provides step-by-step guidance on installing RavenDB on a windows machine."
}

What is RSS (Rich Site Summary)?

RSS (Rich Site Summary; originally RDF Site Summary; often called Really Simple Syndication) uses a family of standard web feed formats to publish frequently updated information: blog entries, news headlines, audio, video. An RSS document (called "feed", "web feed" or "channel") includes full or summarized text, and metadata such as the publishing date and the author's name. An RSS feed is therefore an example of semi-structured/unstructured document data.
In RavenDB, what does the statement below do? using (var ds = new DocumentStore { Url = "http://localhost:8080", DefaultDatabase = "CRUDDemo" }.Initialize())
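
As an illustration of that text-plus-metadata structure, the sketch below parses a small hand-written RSS 2.0 feed with Python's standard xml.etree.ElementTree module. The feed contents (titles, date, author address) are made up for the example; the second item deliberately omits the date and author to show that the fields are optional.

```python
import xml.etree.ElementTree as ET

FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <pubDate>Mon, 01 Jan 2024 10:00:00 GMT</pubDate>
      <author>editor@example.com (Editor)</author>
      <description>Summary of the first post.</description>
    </item>
    <item>
      <title>Second post</title>
      <description>No date or author on this one.</description>
    </item>
  </channel>
</rss>"""

root = ET.fromstring(FEED)

entries = []
for item in root.iter("item"):
    entries.append({
        "title": item.findtext("title"),
        "published": item.findtext("pubDate"),  # None when the element is missing
        "author": item.findtext("author"),      # None when the element is missing
        "summary": item.findtext("description"),
    })

for entry in entries:
    print(entry["title"], "|", entry["published"], "|", entry["author"])
```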

As a first step, we use the DocumentStore class, which inherits from the abstract class DocumentStoreBase. The DocumentStore class manages access to RavenDB and opens sessions to work with RavenDB. It needs a URL and, optionally, the name of the database. Our RavenDB server is running on port 8080 (as configured at installation time). We also specified a DefaultDatabase name, which is CRUDDemo here. The Initialize() function initializes the current instance.
Which feature(s) has MongoDB removed in order to retain scalability?

NOTE: This is objective type question, Please click question title for correct answer.
In case of MongoDB, what is the advantage of representing the data in BSON format as opposed to JSON?

It is primarily because of the following reasons -

a) BSON can be scanned quickly by machines

b) More data types are available in BSON than in JSON

c) BSON brings a more strongly typed system than JSON. BSON is compatible with the native data structures of languages like C#, Java, Python etc.
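
Point (b) can be seen from the JSON side using only Python's standard library. The sketch below checks which values of a sample record plain JSON can encode; dates and raw binary data, both of which BSON defines native types for, are rejected. Encoding to BSON itself would need a third-party library, so it is not shown here. The record and the json_supports helper are made up for the example.

```python
import json
from datetime import datetime

record = {
    "name": "example",
    "created": datetime(2024, 1, 1, 10, 0),  # BSON has a native date type
    "payload": b"\x00\x01",                  # BSON has a native binary type
}


def json_supports(value):
    """Return True if the standard JSON encoder can serialize the value as-is."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False


support = {field: json_supports(value) for field, value in record.items()}
for field, ok in support.items():
    print(field, type(record[field]).__name__, ok)
```

With JSON, the date and the binary payload would first have to be converted to strings by convention, and the reader would have to know to convert them back.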
By default, which database does MongoDB connect to?

NOTE: This is objective type question, Please click question title for correct answer.
Which of the following data types are available in BSON?

NOTE: This is objective type question, Please click question title for correct answer.
What is REDIS?

Redis is an advanced, open source, key-value NoSQL database that is primarily used for building highly scalable web applications. Redis holds its database entirely in memory and uses the disk only for persistence. It has a rich set of data types such as Strings, Lists, Hashes, Sets, and Sorted Sets with range queries, as well as bitmaps, HyperLogLogs, and geospatial indexes with radius queries. It finds its use where very high read and write speed is in demand.
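
To give a feel for one of these data types, the sketch below imitates a sorted set with a score range query in plain Python. It is a toy in-memory model and not the Redis client API; the ToySortedSet class, its method names and the leaderboard data are all assumptions made for illustration.

```python
import bisect


class ToySortedSet:
    """Toy sorted set: unique members, each with a score, kept ordered by score."""

    def __init__(self):
        self._scores = {}   # member -> score
        self._ordered = []  # sorted list of (score, member) pairs

    def add(self, member, score):
        # Re-adding a member replaces its old score.
        if member in self._scores:
            old = (self._scores[member], member)
            self._ordered.pop(bisect.bisect_left(self._ordered, old))
        self._scores[member] = score
        bisect.insort(self._ordered, (score, member))

    def range_by_score(self, low, high):
        """Return members with low <= score <= high, in ascending score order."""
        start = bisect.bisect_left(self._ordered, (low,))
        result = []
        for score, member in self._ordered[start:]:
            if score > high:
                break
            result.append(member)
        return result


leaderboard = ToySortedSet()
leaderboard.add("alice", 120)
leaderboard.add("bob", 95)
leaderboard.add("carol", 150)
leaderboard.add("bob", 130)  # bob's score is updated, not duplicated

print(leaderboard.range_by_score(100, 140))  # ['alice', 'bob']
```

Keeping the members ordered by score is what makes the range query cheap: the search jumps to the first qualifying score and stops at the first one that is too high.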