Python Algorithm Q4
Q4 most-common-word
Q
Given a string paragraph and a string array of the banned words banned, return the most frequent word that is not banned. It is guaranteed there is at least one word that is not banned, and that the answer is unique.
The words in paragraph are case-insensitive and the answer should be returned in lowercase.
Sample Input
Input: paragraph = “Bob hit a ball, the hit BALL flew far after it was hit.”, banned = [“hit”] Output: “ball” Explanation: “hit” occurs 3 times, but it is a banned word. “ball” occurs twice (and no other word does), so it is the most frequent non-banned word in the paragraph. Note that words in the paragraph are not case sensitive, that punctuation is ignored (even if adjacent to words, such as “ball,”), and that “hit” isn’t the answer even though it occurs more because it is banned.
Input: paragraph = “a.”, banned = [] Output: “a”
Constraints
- 1 <= paragraph.length <= 1000
- paragraph consists of English letters, space ‘ ‘, or one of the symbols: “!?’,;.”.
- 0 <= banned.length <= 100
- 1 <= banned[i].length <= 10
- banned[i] consists of only lowercase English letters.
1. My Solution
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
class Solution:
def mostCommonWord(self, paragraph: str, banned: List[str]) -> str:
a = paragraph.lower()
a = re.sub('[\W]',' ',a) #change special letters into space
lis = a.split() # split by space
cntdic = dict(Counter(lis)) #using object 'collections' # make it type dict.
for words in banned: #delete banned words
if words in cntdic:
del cntdic[words]
Ans = [a for a,b in cntdic.items() if b==max(cntdic.values())]
#list comprehension, for searching key with the value.
return (Ans[0])
#for testing
a = Solution()
s = "Bob hit a ball, the hit BALL flew far after it was hit."
a.mostCommonWord(s, ["hit"])
Not that hard question. Just pre-processing the target sentence(lowercase, delete sp letter) and then used ‘counter’ object. This object easily make a group of word and frequency. Then exclude the banned ones. Last one is important. My goal is to check the most frequent and return that word. So we need to search key with the value. I realized this with list comprehenson.
2. Given Solution
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
def mostCommonWord(self, paragraph: str, banned: List[str]) -> str:
words = [word for word in re.sub(r'[^\w]',' ', paragraph).lower().split() if word not in banned]
#using word comprehension, already excepted banned words in definition process.
'''
# the way without counter function. Make counts:words dictionary personally.
counts = collections.defaultdict(int)
for word in words:
counts[word]+=1
return max(counts, key=counts.get)
'''
counts = collections.Counter(words)
return counts.most_common(1)[0][0]
the method “most_common()” of the object “Counter” is very useful. .most_common(X) means that return the list of the tuple of most X common words. if you put 3, it returns top 3 frequent words-counter tuple. Therefore we use (1)[0][0]. (1) -> the most frequent / 1st [0]-> list to tuple / 2nd [0] -> tuple to value itself.
Overall Review
Using list comprehension makes the code really clean. Also it’s so comfortable just like lambda function. Make sure to use it when neccessary.
And realize that python really contains incountable libraries. Just like “Counter”. Learning various libraries is inevitable in python.