File size: 403 Bytes
5fa1a76
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
def select_speaker(speaker_id):
     return 100 <= speaker_counts[speaker_id] <= 400
dataset = dataset.filter(select_speaker, input_columns=["speaker_id"])

Let's check how many speakers remain: 

len(set(dataset["speaker_id"]))
42

Let's see how many examples are left: 

len(dataset)
9973

You are left with just under 10,000 examples from approximately 40 unique speakers, which should be sufficient.