linkedin-skill-assessments-quizzes

Hadoop

Q1. Partitioner controls the partitioning of what data?

Q2. SQL Windowing functions are implemented in Hive using which keywords?

Q3. Rather than adding a Secondary Sort to a slow Reduce job, it is Hadoop best practice to perform which optimization?

Q5. MapReduce jobs can be written in which language?

Q6. To perform local aggregation of the intermediate outputs, MapReduce users can optionally specify which object?

Q7. To verify job status, look for the value ___ in the ___.

Q8. Which line of code implements a Reducer method in MapReduce 2.0?

Q9. To get the total number of mapped input records in a map job task, you should review the value of which counter?

Q10. Hadoop Core supports which CAP capabilities?

Q11. What are the primary phases of a Reducer?

Q12. To set up Hadoop workflow with synchronization of data between jobs that process tasks both on disk and in memory, use the ___ service, which is ___.

Q13. For high availability, which type of multiple nodes should you use?

Q14. DataNode supports which type of drives?

Q15. Which method is used to implement Spark jobs?

Q16. In a MapReduce job, where does the map() function run?

Q17. To reference a master file for lookups during Mapping, what type of cache should be used?

Q18. Skip bad records provides an option where a certain set of bad input records can be skipped when processing what type of data?

Q19. Which command imports data to Hadoop from a MySQL database?

Q20. In what form is Reducer output presented?

Q21. Which library should be used to unit test MapReduce code?

Q22. If you started the NameNode, then which kind of user must you be?

Q23. State _ between the JVMs in a MapReduce job.

Q24. To create a MapReduce job, what should be coded first?

Q25. To connect Hadoop to AWS S3, which client should you use?

Q26. HBase works with which type of schema enforcement?

Q27. HDFS files are of what type?

Q28. A distributed cache file path can originate from what location?

Q29. Which library should you use to perform ETL-type MapReduce jobs?

Q30. What is the output of the Reducer?

The map function processes a key-value pair and emits a set of intermediate key-value pairs; the reduce function then processes all values grouped under the same key and emits another set of key-value pairs as output.
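The map/reduce flow described above can be sketched as a minimal in-memory simulation in plain Python (no Hadoop required, and the `map_fn`/`reduce_fn`/`run_job` names are illustrative, not Hadoop API): a word-count mapper emits (word, 1) pairs, the framework groups pairs by key, and the reducer sums each group.

```python
from collections import defaultdict

def map_fn(key, value):
    # Emit one (word, 1) pair per word in the input line.
    for word in value.split():
        yield word, 1

def reduce_fn(key, values):
    # "values" are all intermediate counts grouped under the same key.
    yield key, sum(values)

def run_job(records):
    # Map phase: apply map_fn to every input key-value pair.
    intermediate = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            intermediate[out_key].append(out_value)
    # Shuffle/sort + reduce phase: process values grouped by key, in key order.
    output = {}
    for key in sorted(intermediate):
        for out_key, out_value in reduce_fn(key, intermediate[key]):
            output[out_key] = out_value
    return output

print(run_job([(0, "to be or not to be")]))
# {'be': 2, 'not': 1, 'or': 1, 'to': 2}
```

Real Hadoop jobs implement the same contract through the `Mapper` and `Reducer` Java classes, with the shuffle and sort handled by the framework between the two phases.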

Q31. To optimize a Mapper, what should you perform first?

Q32. When implemented on a public cloud, with what does Hadoop processing interact?

Q33. In the Hadoop system, what administrative mode is used for maintenance?

Q34. In what format does RecordWriter write an output file?

Q35. To what does the Mapper map input key/value pairs?

Q36. Which Hive query returns the first 1,000 values?

Q37. To implement high availability, how many instances of the master node should you configure?

Q38. Hadoop 2.x and later implement which service as the resource coordinator?

Q39. In MapReduce, _ have _

Q40. What type of software is Hadoop Common?

Q41. If no reduction is desired, you should set the number of _ tasks to zero.

Q42. MapReduce applications use which of these classes to report their statistics?

Q43. _ is the query language, and _ is storage for NoSQL on Hadoop.

Q44. MapReduce 1.0 _ YARN.

Q45. Which type of Hadoop node executes file system namespace operations like opening, closing, and renaming files and directories?

Q46. HQL queries produce which job types?

Q47. Suppose you are trying to finish a Pig script that converts text in the input string to uppercase. What code is needed on line 2 below?

1 data = LOAD '/user/hue/pig/examples/data/midsummer.txt' ...
2

Q48. In a MapReduce job, which phase runs after the Map phase completes?

Q49. Where would you configure the size of a block in a Hadoop environment?

Q50. Hadoop systems are _ RDBMS systems.

Q51. Which object can be used to distribute jars or libraries for use in MapReduce tasks?

Q52. To view the execution details of an Impala query plan, which function would you use?

Q53. Which feature is used to roll back a corrupted HDFS instance to a previously known good point in time?


Q54. Hadoop Common is written in which language?

Q55. Which file system does Hadoop use for storage?

Q56. What kind of storage and processing does Hadoop support?

Q57. Hadoop Common consists of which components?

Q58. Most Apache Hadoop committers’ work is done at which commercial company?

Q59. To get information about Reducer job runs, which object should be added?

Q60. After changing the default block size and restarting the cluster, to which data does the new size apply?

Q61. Which statement should you add to improve the performance of the following query?

SELECT
  c.id,
  c.name,
  c.email_preferences.categories.surveys
FROM customers c;

Q62. What custom object should you implement to reduce IO in MapReduce?

Q63. You can optimize Hive queries using which method?

Q64. If you are processing a single action on each input, what type of job should you create?

Q65. The simplest possible MapReduce job optimization is to perform which of these actions?

Q66. When you implement a custom Writable, you must also define which of these objects?

Q67. To copy a file into the Hadoop file system, what command should you use?

Q68. Delete a Hive _ table and you will delete the table _.

Q69. To see how Hive executed a JOIN operation, use the _ statement and look for the _ value.

Q70. Pig operates in mainly how many modes?

Q71. After loading data, _ and then run a(n) _ query for interactive queries.

Q72. In Hadoop MapReduce job code, what must be static?


Q73. In Hadoop simple mode, which object determines the identity of a client process?


Q74. Which is not a valid input format for a MapReduce job?


Q75. If you see org.apache.hadoop.mapred, which version of MapReduce are you working with?
