Hadoop Spark MCQs
This section focuses on the "Spark" component of the Hadoop ecosystem. Practice these multiple choice questions (MCQs) to improve the Hadoop skills required for interviews (campus, walk-in, and company interviews), placements, entrance exams, and other competitive examinations.
1. Spark is best suited for ______ data.
A. Real-time
B. Virtual
C. Structured
D. All of the above
Ans : A
Explanation: Spark's in-memory processing makes it well suited to real-time data, whereas Hadoop MapReduce is oriented toward batch processing of stored data.
2. Which of the following are features of Apache Spark?
A. Speed
B. Supports multiple languages
C. Advanced Analytics
D. All of the above
Ans : D
Explanation: Apache Spark offers all three: speed, support for multiple languages, and advanced analytics.
3. In how many ways does Spark use Hadoop?
A. 2
B. 3
C. 4
D. 5
Ans : A
Explanation: Spark uses Hadoop in two ways: for storage and for processing.
4. When was Apache Spark developed?
A. 2007
B. 2008
C. 2009
D. 2010
Ans : C
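Explanation: Spark was started in 2009 at UC Berkeley's AMPLab by Matei Zaharia. It was open-sourced in 2010 and later donated to the Apache Software Foundation, where it is a top-level project rather than a Hadoop sub-project.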
5. Which of the following is not a way to deploy Spark?
A. Standalone
B. Hadoop YARN
C. Spark in MapReduce
D. Spark SQL
Ans : D
Explanation: There are three ways to deploy Spark: Standalone, Hadoop YARN, and Spark in MapReduce (SIMR). Spark SQL is a component of Spark, not a deployment mode.
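As a rough illustration of the point above, the deployment mode is normally chosen through the `--master` value passed to `spark-submit`. The sketch below is a minimal, hypothetical helper (the names `master_url` and `submit_command`, the host `node1`, and the file `app.py` are assumptions made for illustration). It covers only the Standalone and YARN modes and does not attempt to model SIMR.

```python
def master_url(mode, host="localhost", port=7077):
    """Return the --master value for a deployment mode (hypothetical helper)."""
    if mode == "standalone":
        # Spark's own cluster manager is addressed as spark://HOST:PORT.
        return f"spark://{host}:{port}"
    if mode == "yarn":
        # On Hadoop YARN the master is simply "yarn"; the cluster location
        # is read from the Hadoop configuration.
        return "yarn"
    # Anything else (for example "spark-sql") is not a deployment mode.
    raise ValueError(f"{mode!r} is not a supported deployment mode")


def submit_command(mode, app="app.py", **kwargs):
    """Build the spark-submit argument list for the given mode."""
    return ["spark-submit", "--master", master_url(mode, **kwargs), app]


if __name__ == "__main__":
    print(" ".join(submit_command("standalone", host="node1")))
    print(" ".join(submit_command("yarn")))
```

Passing "spark-sql" as the mode raises a `ValueError`, which mirrors the answer to the question: Spark SQL is a library that runs on a deployment, not a deployment itself.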
6. ____________ is a component on top of Spark Core that supports structured and semi-structured data.
A. Spark Streaming
B. Spark SQL
C. RDDs
D. None of the above
Ans : B
Explanation: Spark SQL introduced a data abstraction called SchemaRDD (renamed DataFrame in Spark 1.3), which provides support for structured and semi-structured data.
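To make the idea of a structured data abstraction concrete, here is a minimal PySpark sketch. It assumes a local PySpark installation; the function name `run_spark_sql_demo`, the sample rows, and the view name `people` are made up for illustration. In current Spark releases the abstraction is exposed as a DataFrame, which is what the sketch uses.

```python
def run_spark_sql_demo():
    """Query a small in-memory table with Spark SQL (illustrative sketch)."""
    # Imported here so the module can be loaded without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")          # run locally with a single thread
        .appName("spark-sql-sketch")
        .getOrCreate()
    )
    try:
        # Build a DataFrame with named columns from plain Python tuples.
        people = spark.createDataFrame(
            [("alice", 34), ("bob", 19)], ["name", "age"]
        )
        # Register it as a temporary view so it can be queried with SQL.
        people.createOrReplaceTempView("people")
        rows = spark.sql("SELECT name FROM people WHERE age >= 21").collect()
        return [row["name"] for row in rows]
    finally:
        spark.stop()


if __name__ == "__main__":
    print(run_spark_sql_demo())
```

The same query could also be written with DataFrame methods such as `filter` and `select`; the SQL form is shown because the explanation concerns Spark SQL.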
7. ________ is a distributed graph processing framework on top of Spark.
A. MLlib
B. Spark Streaming
C. GraphX
D. None of the above
Ans : C
Explanation: GraphX is Spark's distributed graph processing framework. It started as a research project at UC Berkeley's AMPLab and Databricks and was later donated to the Spark project.
8. Point out the correct statement.
A. Spark enables Apache Hive users to run their unmodified queries much faster
B. Spark interoperates only with Hadoop
C. Spark is a popular data warehouse solution running on top of Hadoop
D. All of the above
Ans : A
Explanation: Shark, a Hive-compatible SQL engine built on Spark and since superseded by Spark SQL, can accelerate Hive queries by as much as 100x when the input data fits into memory, and up to 10x when the input data is stored on disk.
9. Which of the following can be used to launch Spark jobs inside MapReduce?
A. SIM
B. SIMR
C. SIR
D. RIS
Ans : B
Explanation: SIMR (Spark in MapReduce) launches Spark jobs inside MapReduce. With SIMR, users can start experimenting with Spark and use its shell within a couple of minutes of downloading it.
10. Which of the following languages is not supported by Spark?
A. Python
B. Scala
C. Java
D. Pascal
Ans : D
Explanation: Spark provides built-in APIs in Scala, Java, and Python; Pascal is not among them. The Spark engine itself runs in a variety of environments, from cloud services to Hadoop or Mesos clusters.