stephantul edited this page Feb 21, 2020
·
3 revisions
The pattern.it module contains a fast part-of-speech
tagger for Italian (identifies nouns, adjectives, verbs, etc. in a
sentence) and tools for Italian verb conjugation and noun
singularization & pluralization.
The functions in this module take the same parameters and return the
same values as their counterparts in pattern.en.
Refer to the documentation there for more details.
Gender
Italian nouns and adjectives inflect according to gender. The gender() function predicts the gender (MALE, FEMALE, PLURAL) of a given noun with about 92%
accuracy.
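To give a feel for how a gender prediction can be derived from a noun's ending, here is a minimal sketch. It is not the library's implementation: the `guess_gender` helper, its three suffix rules and the local MALE / FEMALE / PLURAL stand-ins are assumptions made for illustration only, and many Italian nouns break these rules (e.g., la mano).

```python
# Local stand-ins for illustration only; pattern.it defines its own constants.
MALE, FEMALE, PLURAL = "m", "f", "pl"

# Hypothetical suffix rules. Nouns ending in -e are left out on purpose,
# because that ending is ambiguous (il cane vs. le case).
SUFFIX_RULES = [
    ("o", MALE),
    ("a", FEMALE),
    ("i", (MALE, PLURAL)),
]

def guess_gender(noun):
    """Return a gender guess based on the final letter, or None if unsure."""
    for suffix, gender in SUFFIX_RULES:
        if noun.lower().endswith(suffix):
            return gender
    return None

print(guess_gender("gatto"))
print(guess_gender("gatti"))
```

A rule set this small fails on ambiguous endings and on exceptions, so it should not be expected to approach the accuracy quoted for gender().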
The article() function returns the
article (INDEFINITE or DEFINITE) inflected by gender (e.g., il gatto → i gatti).
>>> from pattern.it import article, DEFINITE, MALE, PLURAL
>>> print(article('gatti', DEFINITE, gender=(MALE, PLURAL)))
i
Noun singularization & pluralization
For Italian nouns there is singularize() and pluralize(). The implementation is slightly
less robust than the English version (accuracy 84% for singularization
and 93% for pluralization).
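The sketch below illustrates the kind of suffix rewriting that noun inflection involves. It is a toy under stated assumptions, not how pluralize() and singularize() are implemented: the `naive_pluralize` and `naive_singularize` helpers and their rules are hypothetical.

```python
def naive_pluralize(noun):
    """Toy pluralization: -o -> -i, -a -> -e, -e -> -i, otherwise unchanged."""
    for singular, plural in (("o", "i"), ("a", "e"), ("e", "i")):
        if noun.endswith(singular):
            return noun[:-1] + plural
    return noun  # e.g. loanwords ending in a consonant stay as they are

def naive_singularize(noun):
    """Toy singularization: -i -> -o, -e -> -a, otherwise unchanged.

    The -i ending is ambiguous (gatti -> gatto, but cani -> cane),
    so this rule gets nouns like 'cani' wrong.
    """
    for plural, singular in (("i", "o"), ("e", "a")):
        if noun.endswith(plural):
            return noun[:-1] + singular
    return noun

print(naive_pluralize("gatto"))
print(naive_singularize("gatti"))
```

The ambiguity noted in the docstring shows why a handful of rules is not enough on its own and why exceptions need separate handling.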
Verb conjugation
For Italian verbs there is conjugate(),
lemma(), lexeme() and tenses(). The lexicon for verb conjugation
contains about 1,250 common Italian verbs, mined from Wiktionary. For
unknown verbs it will fall back to a rule-based approach with an
accuracy of about 86%.
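The lookup-then-fallback strategy described above can be sketched as follows. This is a minimal illustration, not the library's code: the four-entry `LEXICON`, the imperfect-tense `FALLBACK_RULES` and the `find_lemma` helper are all hypothetical.

```python
# Tiny hypothetical lexicon: inflected form -> infinitive.
LEXICON = {"sono": "essere", "sia": "essere", "era": "essere", "fu": "essere"}

# Hypothetical fallback rules for imperfect endings:
# (inflected suffix, infinitive suffix). Longer suffixes come first.
FALLBACK_RULES = [
    ("avano", "are"), ("evano", "ere"), ("ivano", "ire"),
    ("ava", "are"), ("eva", "ere"), ("iva", "ire"),
]

def find_lemma(verb):
    """Look the verb up in the lexicon, then fall back to suffix rules."""
    verb = verb.lower()
    if verb in LEXICON:
        return LEXICON[verb]
    for suffix, infinitive_suffix in FALLBACK_RULES:
        if verb.endswith(suffix):
            return verb[:-len(suffix)] + infinitive_suffix
    return verb  # no rule matched: return the input unchanged

print(find_lemma("sono"))     # known verb, found in the lexicon
print(find_lemma("parlava"))  # unknown verb, handled by a suffix rule
```

Irregular verbs such as essere cannot be recovered by suffix rules, which is why they have to be listed in the lexicon.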
Italian verbs have more tenses than English verbs. In particular, the
plural differs for each person, and there are additional forms for
the FUTURE tense, the IMPERATIVE, CONDITIONAL and SUBJUNCTIVE mood and the PERFECTIVE aspect:
>>> from pattern.it import conjugate
>>> from pattern.it import INFINITIVE, PRESENT, PAST, SG, SUBJUNCTIVE, PERFECTIVE
>>>
>>> print(conjugate('sono', INFINITIVE))
>>> print(conjugate('sono', PRESENT, 1, SG, mood=SUBJUNCTIVE))
>>> print(conjugate('sono', PAST, 3, SG))
>>> print(conjugate('sono', PAST, 3, SG, aspect=PERFECTIVE))
essere
sia
era
fu
For PAST tense + PERFECTIVE aspect we can also use PRETERITE (passato remoto). For PAST tense + IMPERFECTIVE aspect we can also use IMPERFECT (imperfetto).
>>> from pattern.it import conjugate
>>> from pattern.it import IMPERFECT, PRETERITE
>>>
>>> print(conjugate('sono', IMPERFECT, 3, SG))
>>> print(conjugate('sono', PRETERITE, 3, SG))
era
fu
The conjugate() function takes the
following optional parameters:
| Tense | Person | Number | Mood | Aspect | Alias | Example |
|---|---|---|---|---|---|---|
| INFINITIVE | None | None | None | None | "inf" | essere |
| PRESENT | 1 | SG | INDICATIVE | IMPERFECTIVE | "1sg" | io __sono__ |
| PRESENT | 2 | SG | INDICATIVE | IMPERFECTIVE | "2sg" | tu __sei__ |
| PRESENT | 3 | SG | INDICATIVE | IMPERFECTIVE | "3sg" | lui __è__ |
| PRESENT | 1 | PL | INDICATIVE | IMPERFECTIVE | "1pl" | noi __siamo__ |
| PRESENT | 2 | PL | INDICATIVE | IMPERFECTIVE | "2pl" | voi __siete__ |
| PRESENT | 3 | PL | INDICATIVE | IMPERFECTIVE | "3pl" | loro __sono__ |
| PRESENT | None | None | INDICATIVE | PROGRESSIVE | "part" | essendo |
| PRESENT | 2 | SG | IMPERATIVE | IMPERFECTIVE | "2sg!" | sii |
| PRESENT | 3 | SG | IMPERATIVE | IMPERFECTIVE | "3sg!" | sia |
| PRESENT | 1 | PL | IMPERATIVE | IMPERFECTIVE | "1pl!" | siamo |
| PRESENT | 2 | PL | IMPERATIVE | IMPERFECTIVE | "2pl!" | siate |
| PRESENT | 3 | PL | IMPERATIVE | IMPERFECTIVE | "3pl!" | siano |
| PRESENT | 1 | SG | SUBJUNCTIVE | IMPERFECTIVE | "1sg?" | io __sia__ |
| PRESENT | 2 | SG | SUBJUNCTIVE | IMPERFECTIVE | "2sg?" | tu __sia__ |
| PRESENT | 3 | SG | SUBJUNCTIVE | IMPERFECTIVE | "3sg?" | lui __sia__ |
| PRESENT | 1 | PL | SUBJUNCTIVE | IMPERFECTIVE | "1pl?" | noi __siamo__ |
| PRESENT | 2 | PL | SUBJUNCTIVE | IMPERFECTIVE | "2pl?" | voi __siate__ |
| PRESENT | 3 | PL | SUBJUNCTIVE | IMPERFECTIVE | "3pl?" | loro __siano__ |
| PAST | 1 | SG | INDICATIVE | IMPERFECTIVE | "1sgp" | io __ero__ |
| PAST | 2 | SG | INDICATIVE | IMPERFECTIVE | "2sgp" | tu __eri__ |
| PAST | 3 | SG | INDICATIVE | IMPERFECTIVE | "3sgp" | lui __era__ |
| PAST | 1 | PL | INDICATIVE | IMPERFECTIVE | "1ppl" | noi __eravamo__ |
| PAST | 2 | PL | INDICATIVE | IMPERFECTIVE | "2ppl" | voi __eravate__ |
| PAST | 3 | PL | INDICATIVE | IMPERFECTIVE | "3ppl" | loro __erano__ |
| PAST | None | None | INDICATIVE | PROGRESSIVE | "ppart" | stato |
| PAST | 1 | SG | INDICATIVE | PERFECTIVE | "1sgp+" | io __fui__ |
| PAST | 2 | SG | INDICATIVE | PERFECTIVE | "2sgp+" | tu __fosti__ |
| PAST | 3 | SG | INDICATIVE | PERFECTIVE | "3sgp+" | lui __fu__ |
| PAST | 1 | PL | INDICATIVE | PERFECTIVE | "1ppl+" | noi __fummo__ |
| PAST | 2 | PL | INDICATIVE | PERFECTIVE | "2ppl+" | voi __foste__ |
| PAST | 3 | PL | INDICATIVE | PERFECTIVE | "3ppl+" | loro __furono__ |
| PAST | 1 | SG | SUBJUNCTIVE | IMPERFECTIVE | "1sgp?" | io __fossi__ |
| PAST | 2 | SG | SUBJUNCTIVE | IMPERFECTIVE | "2sgp?" | tu __fossi__ |
| PAST | 3 | SG | SUBJUNCTIVE | IMPERFECTIVE | "3sgp?" | lui __fosse__ |
| PAST | 1 | PL | SUBJUNCTIVE | IMPERFECTIVE | "1ppl?" | noi __fossimo__ |
| PAST | 2 | PL | SUBJUNCTIVE | IMPERFECTIVE | "2ppl?" | voi __foste__ |
| PAST | 3 | PL | SUBJUNCTIVE | IMPERFECTIVE | "3ppl?" | loro __fossero__ |
| FUTURE | 1 | SG | INDICATIVE | IMPERFECTIVE | "1sgf" | io __sarò__ |
| FUTURE | 2 | SG | INDICATIVE | IMPERFECTIVE | "2sgf" | tu __sarai__ |
| FUTURE | 3 | SG | INDICATIVE | IMPERFECTIVE | "3sgf" | lui __sarà__ |
| FUTURE | 1 | PL | INDICATIVE | IMPERFECTIVE | "1plf" | noi __saremo__ |
| FUTURE | 2 | PL | INDICATIVE | IMPERFECTIVE | "2plf" | voi __sarete__ |
| FUTURE | 3 | PL | INDICATIVE | IMPERFECTIVE | "3plf" | loro __saranno__ |
| CONDITIONAL | 1 | SG | INDICATIVE | IMPERFECTIVE | "1sg->" | io __sarei__ |
| CONDITIONAL | 2 | SG | INDICATIVE | IMPERFECTIVE | "2sg->" | tu __saresti__ |
| CONDITIONAL | 3 | SG | INDICATIVE | IMPERFECTIVE | "3sg->" | lui __sarebbe__ |
| CONDITIONAL | 1 | PL | INDICATIVE | IMPERFECTIVE | "1pl->" | noi __saremmo__ |
| CONDITIONAL | 2 | PL | INDICATIVE | IMPERFECTIVE | "2pl->" | voi __sareste__ |
| CONDITIONAL | 3 | PL | INDICATIVE | IMPERFECTIVE | "3pl->" | loro __sarebbero__ |
Instead of optional parameters, a single short alias, or PARTICIPLE or PAST+PARTICIPLE can also be given. With no
parameters, the infinitive form of the verb is returned.
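To make the relationship between aliases and the five optional parameters concrete, the sketch below expands a few aliases into the values listed for them in the parameter table above. It is an illustration only: the `ALIASES` dictionary covers just eight rows, the constants are represented by plain strings rather than the library's own values, and `expand_alias` is a hypothetical helper, not part of pattern.it.

```python
# A handful of rows copied from the parameter table.
# Strings stand in for the library's constants (an assumption of this sketch).
ALIASES = {
    "inf":   ("INFINITIVE", None, None, None, None),
    "1sg":   ("PRESENT", 1, "SG", "INDICATIVE", "IMPERFECTIVE"),
    "2sg!":  ("PRESENT", 2, "SG", "IMPERATIVE", "IMPERFECTIVE"),
    "1sg?":  ("PRESENT", 1, "SG", "SUBJUNCTIVE", "IMPERFECTIVE"),
    "3sgp":  ("PAST", 3, "SG", "INDICATIVE", "IMPERFECTIVE"),
    "3sgp+": ("PAST", 3, "SG", "INDICATIVE", "PERFECTIVE"),
    "1plf":  ("FUTURE", 1, "PL", "INDICATIVE", "IMPERFECTIVE"),
    "1sg->": ("CONDITIONAL", 1, "SG", "INDICATIVE", "IMPERFECTIVE"),
}

FIELDS = ("tense", "person", "number", "mood", "aspect")

def expand_alias(alias):
    """Return the table row for an alias as a dict of named parameters."""
    return dict(zip(FIELDS, ALIASES[alias]))

print(expand_alias("3sgp+"))
```

Reading the table this way shows the suffix conventions at a glance: `!` marks the imperative, `?` the subjunctive, `p` the past, `+` the perfective aspect, `f` the future and `->` the conditional.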
Attributive & predicative adjectives
Italian adjectives inflect with suffixes -o → -i (masculine) and -a → -e (feminine), with some exceptions (e.g.,
grande → i grandi felini). You can get the base form with the predicative() function. A statistical
approach is used with an accuracy of 88%.
>>> from pattern.it import predicative
>>> print(predicative('grandi'))
grande
Parser
For parsing there is parse(),
parsetree() and split().
The parse() function annotates words in
the given string with their part-of-speech
tags (e.g.,
NN for nouns and VB for verbs). The parsetree() function takes a string and
returns a tree of nested objects (Text → Sentence → Chunk → Word). The split() function takes the output of parse() and returns a Text. See the pattern.en documentation for how to
manipulate Text objects.
>>> from pattern.it import parse, split
>>>
>>> s = parse('Il gatto nero faceva le fusa.')
>>> for sentence in split(s):
...     print(sentence)
Sentence('Il/DT/B-NP/O gatto/NN/I-NP/O nero/JJ/I-NP/O'
'faceva/VB/B-VP/O'
'le/DT/B-NP/O fusa/NN/I-NP/O ././O/O')
The parser relies on a lexicon mined from Wiktionary. Its accuracy is around 92%.
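The slash-separated annotations in the output above can also be read without the Text API. The sketch below assumes each token carries exactly four fields, taken here to be word, part-of-speech tag, chunk tag and a fourth tag (assumed to be the prepositional-phrase tag, O throughout this sentence). The `read_tagged` helper is hypothetical, and the tagged string is the example sentence joined with spaces.

```python
def read_tagged(sentence):
    """Split a tagged sentence into (word, pos, chunk, pnp) tuples."""
    tokens = []
    for token in sentence.split():
        # rsplit from the right so only the last three slashes separate tags.
        word, pos, chunk, pnp = token.rsplit("/", 3)
        tokens.append((word, pos, chunk, pnp))
    return tokens

tagged = ("Il/DT/B-NP/O gatto/NN/I-NP/O nero/JJ/I-NP/O "
          "faceva/VB/B-VP/O le/DT/B-NP/O fusa/NN/I-NP/O ././O/O")

# Collect the words tagged as nouns.
nouns = [word for word, pos, chunk, pnp in read_tagged(tagged) if pos == "NN"]
print(nouns)
```

For anything beyond quick inspection, the Text, Sentence, Chunk and Word objects returned by split() and parsetree() are the more convenient interface.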