Emoji have become ubiquitous in written communication, on the Web and beyond. They can emphasize or clarify emotions, add details to conversations, or simply serve decorative purposes. This casual use, however, barely scratches the surface of the expressive power of emoji. To further unleash this power, we present Emojinize, a method for translating arbitrary text phrases into sequences of one or more emoji without requiring human input. By leveraging the power of large language models, Emojinize can choose appropriate emoji by disambiguating based on context (eg, cricket-bat vs bat) and can express complex concepts compositionally by combining multiple emoji (eq, "Emojinize" is translated to input-latin-letters right-arrow grinning-face). In a cloze test--based user study, we show that Emojinize's emoji translations increase the human guessability of masked words by 55%, whereas human-picked emoji translations do so by only 29%. These results suggest that emoji provide a sufficiently rich vocabulary to accurately translate a wide variety of words. Moreover, annotating words and phrases with Emojinize's emoji translations opens the door to numerous downstream applications, including children learning how to read, adults learning foreign languages, and text understanding for people with learning disabilities.
The learning of a new language remains to this date a cognitive task that requires considerable diligence and willpower, recent advances and tools notwithstanding. In this paper, we propose Broccoli, a new paradigm aimed at reducing the required effort by seamlessly embedding vocabulary learning into users' everyday information diets. This is achieved by inconspicuously switching chosen words encountered by the user for their translation in the target language. Thus, by seeing words in context, the user can assimilate new vocabulary without much conscious effort. We validate our approach in a careful user study, finding that the efficacy of the lightweight Broccoli approach is competitive with traditional, memorization-based vocabulary learning. The low cognitive overhead is manifested in a pronounced decrease in learners' usage of mnemonic learning strategies, as compared to traditional learning. Finally, we establish that language patterns in typical information diets are compatible with spaced-repetition strategies, thus enabling an efficient use of the Broccoli paradigm. Overall, our work establishes the feasibility of a novel and powerful "install-and-forget" approach for embedded language acquisition.