Hadoop Spark MCQs
This section focuses on the "Spark" topic of Hadoop. Practice these Multiple Choice Questions (MCQs) to improve the Hadoop skills required for interviews (campus, walk-in, and company interviews), placements, entrance exams, and other competitive examinations.
1. Spark is best suited for ______ data.
Explanation: Spark is best suited for real-time data whereas Hadoop is best suited for structured data.
2. Which of the following are features of Apache Spark?
Explanation: Apache Spark has the following features: speed, support for multiple languages, and advanced analytics.
3. In how many ways does Spark use Hadoop?
Explanation: Spark uses Hadoop in two ways: one is storage and the other is processing.
4. When was Apache Spark developed?
Explanation: Spark is one of Hadoop's subprojects, developed in 2009 at UC Berkeley's AMPLab by Matei Zaharia.
5. Which of the following is an incorrect way to deploy Spark?
Explanation: There are three ways to deploy Spark: Standalone, Hadoop YARN, and Spark in MapReduce (SIMR).
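As a rough illustration of how the deployment choice shows up in practice, the sketch below maps two of these modes to the value usually passed to spark-submit's --master option. The helper name build_submit_command, the host name, and the application file are hypothetical, and 7077 is assumed as the standalone master port. Spark in MapReduce is omitted from the sketch.

```python
# Hypothetical helper: maps a deployment mode to a spark-submit --master value.
MASTER_URLS = {
    "standalone": "spark://{host}:{port}",  # Spark's own built-in cluster manager
    "yarn": "yarn",                          # Hadoop YARN; cluster located via Hadoop config
}


def build_submit_command(mode, app, host="master-host", port=7077):
    """Return the spark-submit argument list for the given deployment mode."""
    if mode not in MASTER_URLS:
        raise ValueError(f"unknown deployment mode: {mode}")
    # Templates without placeholders (e.g. "yarn") pass through format() unchanged.
    master = MASTER_URLS[mode].format(host=host, port=port)
    return ["spark-submit", "--master", master, app]


if __name__ == "__main__":
    # "app.py" is a placeholder application file, not a real script.
    for mode in ("standalone", "yarn"):
        print(" ".join(build_submit_command(mode, "app.py")))
```

The application itself stays the same across modes; only the master value changes, which is why the mode is kept as data in a lookup table rather than as branching logic.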
6. ____________ is a component on top of Spark Core.
Explanation: Spark SQL is a component on top of Spark Core that introduces a data abstraction called SchemaRDD (renamed DataFrame in Spark 1.3), which provides support for structured and semi-structured data.
7. ________ is a distributed graph processing framework on top of Spark.
Explanation: GraphX started as a research project at UC Berkeley's AMPLab and Databricks, and was later donated to the Spark project.
8. Point out the correct statement.
Explanation: Shark can accelerate Hive queries by as much as 100x when the input data fits into memory, and up to 10x when the input data is stored on disk.
9. Which of the following can be used to launch Spark jobs inside MapReduce?
Explanation: With SIMR, users can start experimenting with Spark and use its shell within a couple of minutes after downloading it.
10. Which of the following languages is not supported by Spark?
Explanation: The Spark engine runs in a variety of environments, from cloud services to Hadoop or Mesos clusters.