date 2021-12-01
[experimental] Natural Language Processing functions
warning
This is an experimental feature that is currently in development and is not ready for general use. It will change in unpredictable, backwards-incompatible ways in future releases. Set allow_experimental_nlp_functions = 1 to enable it.
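As a minimal sketch, the setting can be turned on either for the whole session or for a single query. The stem call in the second statement is only a placeholder query to attach the setting to.

```sql
-- Enable the experimental NLP functions for the current session.
SET allow_experimental_nlp_functions = 1;

-- Alternatively, enable them for one query only.
SELECT stem('en', 'blessing')
SETTINGS allow_experimental_nlp_functions = 1;
```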
stem
Performs stemming on a given word.
Syntax
stem('language', word)
Arguments
language— Language whose rules will be applied. Must be in lowercase. String.
word— Word that needs to be stemmed. Must be in lowercase. String.
Examples
Query:
SELECT arrayMap(x -> stem('en', x), ['I', 'think', 'it', 'is', 'a', 'blessing', 'in', 'disguise']) as res;
Result:
┌─res────────────────────────────────────────────────┐
│ ['I','think','it','is','a','bless','in','disguis'] │
└────────────────────────────────────────────────────┘
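Because word must be in lowercase, raw text can be normalized before it is stemmed. The following is a sketch under that assumption, not a prescribed pipeline: it lowercases a sentence with lower, splits it on spaces with splitByChar, and stems each token. It assumes allow_experimental_nlp_functions = 1 is already set for the session.

```sql
-- Sketch: lowercase and tokenize a sentence, then stem every token.
-- Assumes allow_experimental_nlp_functions = 1 is set for the session.
SELECT arrayMap(
    x -> stem('en', x),
    splitByChar(' ', lower('I think it is a blessing in disguise'))
) AS res;
```

Splitting on a single space is the simplest possible tokenizer. Text with punctuation or repeated whitespace would need a more careful split.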
lemmatize
Performs lemmatization on a given word. Needs dictionaries to operate, which can be obtained here.
Syntax
lemmatize('language', word)
Arguments
language— Language whose rules will be applied. String.
word— Word that needs to be lemmatized. Must be lowercase. String.
Examples
Query:
SELECT lemmatize('en', 'wolves');
Result:
┌─lemmatize("wolves")─┐
│ "wolf" │
└─────────────────────┘
Configuration:
<lemmatizers>
    <lemmatizer>
        <lang>en</lang>
        <path>en.bin</path>
    </lemmatizer>
</lemmatizers>
synonyms
Finds synonyms of a given word. There are two types of synonym extensions: plain and wordnet.
With the plain extension type, we need to provide a path to a simple text file in which each line corresponds to one synonym set. Words in a line must be separated by space or tab characters.
With the wordnet extension type, we need to provide a path to a directory containing the WordNet thesaurus. The thesaurus must contain a WordNet sense index.
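To make the plain file format concrete, the sketch below shows hypothetical file contents as comments, followed by a lookup. The file contents are assumptions made for illustration: the first line is chosen to be consistent with the example result in this section, and the second line is invented. The extension name list is taken from the example query and is assumed to be configured as a plain extension pointing at this file.

```sql
-- Hypothetical contents of a plain synonym file.
-- Each line is one synonym set; words are separated by spaces or tabs.
--
--   important big critical crucial
--   fast quick rapid
--
-- Look up a word in the extension assumed to be registered as 'list'.
SELECT synonyms('list', 'important');
```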
Syntax
synonyms('extension_name', word)
Arguments
extension_name— Name of the extension in which the search will be performed. String.
word— Word to search for in the extension. String.
Examples
Query:
SELECT synonyms('list', 'important');
Result:
┌─synonyms('list', 'important')────────────┐
│ ['important','big','critical','crucial'] │
└──────────────────────────────────────────┘
Configuration:
<synonyms_extensions>
    <extension>
        <name>en</name>
        <type>plain</type>
        <path>en.txt</path>
    </extension>
    <extension>
        <name>en</name>
        <type>wordnet</type>
        <path>en/</path>
    </extension>
</synonyms_extensions>