1. Typewriters: language models
1.1 Spam filtering
Let's start with an elementary language processing task: spam filtering. A probabilistic approach to this task involves three steps:
1. collection of a representative set of e-mails, containing a set $D_{ok}$ of acceptable e-mails and a set $D_{ko}$ of undesirable e-mails;
2. construction of a numerical representation for texts. A very simple representation transforms each e-mail d into a large binary vector h in
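
The binary representation mentioned in step 2 can be illustrated with a minimal sketch. It assumes that each component of h flags whether a given vocabulary word occurs in the e-mail, a detail the visible text does not spell out. The function names, the whitespace tokenization, and the toy e-mails are all hypothetical, chosen only for illustration.

```python
def build_vocabulary(emails):
    """Map each distinct lowercase token to a fixed index (sorted for determinism)."""
    tokens = sorted({tok for text in emails for tok in text.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def to_binary_vector(email, vocabulary):
    """Return h with h[i] = 1 if vocabulary word i occurs in the e-mail, else 0.

    Assumption: presence/absence encoding; words outside the vocabulary are ignored.
    """
    h = [0] * len(vocabulary)
    for tok in email.lower().split():
        idx = vocabulary.get(tok)
        if idx is not None:
            h[idx] = 1
    return h


# Hypothetical toy corpus: acceptable and undesirable e-mails.
d_ok = ["meeting at noon", "see you at lunch"]
d_ko = ["win money now", "free money"]

vocabulary = build_vocabulary(d_ok + d_ko)
h = to_binary_vector("free lunch now", vocabulary)
```

In practice the vocabulary would contain many thousands of words, so h is large and mostly zeros; a sparse encoding (for example, storing only the indices set to 1) is a common design choice in that case.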
This article is included in
Technological innovations