Hadoop Lucene MCQs
This section focuses on "Lucene" in Hadoop. Practice these multiple choice questions (MCQs) to improve the Hadoop skills required for interviews (campus, walk-in, and company interviews), placements, entrance exams, and other competitive examinations.
1. ___________ provides Java-based indexing and search technology.
A. Solr
B. Lucene Core
C. Lucene
D. None of the above
Ans : B
Explanation: Lucene provides spellchecking, hit highlighting and advanced analysis/tokenization capabilities.
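To make "indexing and search" concrete, here is a minimal sketch of a toy inverted index. It is not Lucene's API; the function names `build_index` and `search` and the sample documents are hypothetical, and real Lucene adds analysis, scoring, and on-disk segments on top of this idea.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical sample documents, keyed by document id.
docs = {
    1: "Lucene provides indexing and search",
    2: "Solr is built on Lucene",
    3: "Hadoop stores large files",
}
index = build_index(docs)
print(sorted(search(index, "lucene search")))  # prints [1]
```

Whitespace splitting stands in for the analysis/tokenization step that the explanation mentions.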
2. Point out the correct statement.
A. Building PyLucene requires GNU Make, a recent version of Ant capable of building Java Lucene and a C++ compiler
B. PyLucene is supported on Mac OS X, Linux, Solaris and Windows
C. Use of setuptools is recommended for PyLucene
D. All of the above
Ans : D
Explanation: PyLucene requires Python version 2.x (x >= 3.5) and Java version 1.x (x >= 5).
3. ____________ is a subproject with the aim of collecting and distributing free materials.
A. ORP
B. OPR
C. OCR
D. ORC
Ans : A
Explanation: The Open Relevance Project (ORP) collects and distributes free materials for relevance testing and performance.
4. Lucene provides scalable, high-performance indexing of over ______ per hour on modern hardware.
A. 1GB
B. 150GB
C. 1TB
D. 150TB
Ans : B
Explanation: Lucene offers powerful features through a simple API.
5. Lucene index size is roughly _______ times the size of the text indexed.
A. 0.1
B. 0.15
C. 0.2
D. 0.25
Ans : C
Explanation: Lucene provides incremental indexing as fast as batch indexing.
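The figures from questions 4 and 5 can be combined into a rough capacity estimate. The sketch below assumes the quiz's numbers (150GB per hour and an index-to-text ratio of 0.2) hold exactly; the function name `estimate` and the 600 GB corpus are hypothetical, and actual throughput and index size depend on hardware, analyzers, and stored fields.

```python
INDEXING_RATE_GB_PER_HOUR = 150  # throughput figure from question 4
INDEX_TO_TEXT_RATIO = 0.2        # index-size ratio from question 5

def estimate(text_gb):
    """Return (hours to index, index size in GB) for text_gb of raw text."""
    hours = text_gb / INDEXING_RATE_GB_PER_HOUR
    index_gb = text_gb * INDEX_TO_TEXT_RATIO
    return hours, index_gb

# Hypothetical corpus of 600 GB of text.
hours, index_gb = estimate(600)
print(f"{hours:.1f} h, {index_gb:.0f} GB")  # prints 4.0 h, 120 GB
```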
6. All file access uses Java's __________ APIs which give Lucene stronger index safety.
A. NIO.1
B. NIO.2
C. NIO.3
D. NIO.4
Ans : B
Explanation: Index safety is provided in terms of better error handling and safer commits.
7. Heap usage during IndexWriter merging is also much lower with the new _________
A. LucCodec
B. Lucene50Codec
C. Lucene20Cod
D. All of the above
Ans : B
Explanation: Doc values and norms for the segments being merged are no longer fully loaded into heap for all fields.
8. PostingsFormat now uses a __________ API when writing postings, just like doc values.
A. read
B. write
C. push
D. pull
Ans : D
Explanation: This is powerful because you can do things in your postings format that require making more than one pass through the postings such as iterating over all postings.
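The benefit of a pull API can be illustrated with a small sketch in which the consumer drives the iteration and reads the postings twice. This is a conceptual illustration only, not Lucene's PostingsFormat interface; the `Postings` class, `write_postings`, and the zero-padded encoding are hypothetical.

```python
class Postings:
    """Hypothetical stand-in for a postings source: term -> list of doc ids."""

    def __init__(self, data):
        self._data = data

    def terms(self):
        return sorted(self._data)

    def docs(self, term):
        # A fresh iterator on every call, so the consumer may re-read postings.
        return iter(self._data[term])

def write_postings(postings):
    """Pull-style consumer: it drives the iteration and makes two passes."""
    # Pass 1: gather a statistic over all postings (the largest doc id).
    max_doc = max(doc for term in postings.terms() for doc in postings.docs(term))
    width = len(str(max_doc))
    # Pass 2: encode each posting using what pass 1 learned.
    return {
        term: [str(doc).zfill(width) for doc in postings.docs(term)]
        for term in postings.terms()
    }

encoded = write_postings(Postings({"lucene": [1, 7, 120], "solr": [7, 42]}))
print(encoded)
```

With a push API the producer would hand each posting to the consumer exactly once, so the first pass would have to buffer everything before the second could start.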
9. SolrJ now has first-class support for the __________ API.
A. Compactions
B. Collection
C. Distribution
D. All of the above
Ans : B
Explanation: Solr is the popular, blazing-fast, open source enterprise search platform built on Apache Lucene.
10. ____________ can be used to generate stats over the results of arbitrary numeric functions.
A. stats.field
B. sta.field
C. stats.value
D. stat.value
Ans : A
Explanation: stats.field also allows requesting statistics for pivot facets using tags.
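A request using stats.field might be assembled as in the sketch below. It only builds a query string and contacts no server; the collection name `products` and the field names `price`, `popularity`, `cat`, and `inStock` are hypothetical, and the local-params syntax should be checked against the Solr version in use.

```python
from urllib.parse import parse_qs, urlencode

params = [
    ("q", "*:*"),
    ("rows", "0"),
    ("stats", "true"),
    # Statistics over the result of a numeric function rather than a plain field.
    ("stats.field", "{!func}product(price,popularity)"),
    # A tagged stats.field that a pivot facet can refer to by its tag.
    ("stats.field", "{!tag=piv1}price"),
    ("facet", "true"),
    ("facet.pivot", "{!stats=piv1}cat,inStock"),
]
query_string = urlencode(params)
print("/solr/products/select?" + query_string)

# Round-trip check: decoding recovers both stats.field values in order.
print(parse_qs(query_string)["stats.field"])
```

A list of tuples is used instead of a dict because stats.field appears more than once.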