Before 6 months of age, infants succeed in learning words associated with objects and actions when the words are presented in isolation or embedded in short utterances. It remains unclear whether this type of learning occurs from fluent audiovisual stimuli, although fluent audiovisual contexts are the default in natural environments. In four experiments, we evaluated whether 8-month-old infants could learn word-action and word-object associations from fluent audiovisual streams when the words conveyed either vowel or consonant harmony, two phonological cues that benefit word learning near 6 and 12 months of age, respectively. We found that infants learned both types of words, but only when the words contained vowel harmony. Because object- and action-words have been conceived as rudimentary representations of nouns and verbs, our results suggest that vowels contribute to shaping the initial steps in the learning of lexical categories in preverbal infants.