Natural Language Processing for Python
Embedding
- CharEmbedding
- PositionEmbedding
- WordEmbedding
Text classification
Available models
All of the following models include Dropout, Pooling, and Dense layers with hyperparameters tuned for reasonable performance across standard text classification tasks. If necessary, they are a good basis for further performance tuning.
- text_cnn
- text_rnn
- attention_rnn
- text_rcnn
- text_han
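The text_cnn variant follows the standard convolutional text classifier: convolve filters over the token-embedding sequence, max-pool over time, and classify with a dense softmax layer. A minimal NumPy sketch of that forward pass — all shapes, names, and hyperparameters here are illustrative, not the library's API or defaults:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative hyperparameters (not the library's defaults).
seq_len, emb_dim, kernel, n_filters, n_classes = 12, 8, 3, 4, 2

x = rng.normal(size=(seq_len, emb_dim))            # token embeddings for one text
W = rng.normal(size=(kernel, emb_dim, n_filters))  # convolution filters
b = np.zeros(n_filters)

# Convolve over time: one ReLU activation per window position per filter.
windows = np.stack([x[i:i + kernel] for i in range(seq_len - kernel + 1)])
feats = np.maximum(np.einsum("wke,kef->wf", windows, W) + b, 0.0)

# Global max-pooling over time, then a dense softmax classifier.
pooled = feats.max(axis=0)
logits = pooled @ rng.normal(size=(n_filters, n_classes))
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

The recurrent and attention variants replace the convolution/pooling stage with an RNN (optionally with attention) but keep the same embedding-in, class-probabilities-out shape.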
Examples
Choose a pre-trained word embedding by setting the embedding_type
and the corresponding embedding dimensions. Set embedding_type=None
to initialize the word embeddings randomly (but make sure to set trainable_embeddings=True
so you actually train the embeddings).
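The embedding_type=None case can be pictured with a small NumPy sketch (init_embeddings is a hypothetical helper, not part of the library): randomly initialized vectors carry no pre-trained signal, which is why they must stay trainable.

```python
import numpy as np

def init_embeddings(vocab_size, dim, pretrained=None, seed=0):
    """Hypothetical helper: use a pre-trained matrix if given, else random init."""
    if pretrained is not None:
        assert pretrained.shape == (vocab_size, dim)
        return pretrained.copy()
    rng = np.random.default_rng(seed)
    # Small uniform values, a common scheme for randomly initialized embeddings.
    return rng.uniform(-0.05, 0.05, size=(vocab_size, dim))

# embedding_type=None case: random vectors with no pre-trained signal,
# useful only if trainable_embeddings=True lets training adjust them.
emb = init_embeddings(vocab_size=100, dim=300)
```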
FastText
Several pre-trained FastText embeddings are included. For now, we only have the word embeddings, not the n-gram features. All embeddings have 300 dimensions.
- English Vectors: e.g. fasttext.wn.1M.300d, check out all available embeddings
- Multilang Vectors: in the format fasttext.cc.LANG_CODE, e.g. fasttext.cc.en
- Wikipedia Vectors: in the format fasttext.wiki.LANG_CODE, e.g. fasttext.wiki.en
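The naming scheme for these embeddings can be summarized in a small helper (fasttext_embedding_name is hypothetical, for illustration only):

```python
def fasttext_embedding_name(source, lang_code=None, variant=None):
    """Hypothetical helper illustrating the naming scheme described above."""
    if source == "cc":            # Common Crawl multilang vectors
        return f"fasttext.cc.{lang_code}"
    if source == "wiki":          # Wikipedia vectors
        return f"fasttext.wiki.{lang_code}"
    return f"fasttext.{variant}"  # e.g. the English wn.1M.300d vectors

print(fasttext_embedding_name("cc", "en"))  # fasttext.cc.en
```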
Dataset