TDA-C01 Exam - Question 32

Question

You have the following dataset. Which grouping option should you use in Tableau Prep to group all five names automatically?

Examice · Accepted Answer

The correct answer is 'Common Characters'. This grouping option finds and groups values that have letters or numbers in common. It uses the ngram fingerprint algorithm, which indexes words by their unique characters after removing punctuation, duplicates, and whitespace. Because delimiters (underscores, commas, spaces, and hyphens) are stripped before the key is built, names that differ only in formatting and capitalization, like those in the dataset, are grouped automatically.

84db7a1 · Answer

Answer is B

Common Characters: Find and group values that have letters or numbers in common. This option uses the ngram fingerprint algorithm that indexes words by their unique characters after removing punctuation, duplicates, and whitespace. This algorithm works for any supported language. This option isn't available for data roles.

For example, this algorithm would match names that are represented as "John Smith" and "Smith, John" because they both generate the key "hijmnost". Since this algorithm doesn't consider pronunciation, the value "Tom Jhinois" would have the same key "hijmnost" and would also be included in the group.
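To make the key-building idea concrete, here is a minimal Python sketch. It assumes the key is simply the sorted set of lowercased letters and digits in a value, which reproduces the "hijmnost" key quoted above. It is not Tableau Prep's actual implementation, and `char_key` and `group_by_key` are hypothetical helper names.

```python
from collections import defaultdict


def char_key(value: str) -> str:
    """Build a key from the unique letters and digits of a value.

    Assumption: lowercase the text, drop punctuation and whitespace,
    drop duplicate characters, then sort what is left.
    """
    unique_chars = {ch for ch in value.lower() if ch.isalnum()}
    return "".join(sorted(unique_chars))


def group_by_key(values):
    """Group values that produce the same character key."""
    groups = defaultdict(list)
    for value in values:
        groups[char_key(value)].append(value)
    return dict(groups)


# The three names from the example above share the key "hijmnost".
names = ["John Smith", "Smith, John", "Tom Jhinois"]
print(group_by_key(names))
```

Under the same assumption, delimiter variants such as "John_smith" and "John-smith" also map to "hijmnost", since underscores and hyphens are discarded before the key is built.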

iccent2 · Answer

I agree that the correct answer is B and not C. Here is why:
Pronunciation might not group variations like "John_smith" and "John, Smith" because they differ significantly in spelling and format.
Spelling focuses on minor differences in spelling but might not handle different delimiters (like underscores, commas, spaces, and hyphens) effectively.
Manual Selection would require you to manually select and group each variation, which isn't automatic.
The Common Characters option compares the unique letters and numbers in each value after punctuation and whitespace are removed, making it effective at grouping variations such as "John, Smith", "John_smith", "John Smith", and "John-smith" automatically.

MonBouj · Answer

B. Common Characters

Common Characters: This method is useful for fixing capitalization or formatting issues.
