An abstract noun is a noun that does not describe a physical object, for example philosophy. Contrast concrete noun.
An accepter is a program (or algorithm) that takes as input a grammar and a string of terminal symbols from the alphabet of that grammar, and outputs yes (or something equivalent) if the string is a sentence of the grammar, and no otherwise. Contrast parser.
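To make this concrete, here is a minimal sketch of an accepter in Prolog DCG notation; the toy grammar and the predicate name accept/1 are illustrative assumptions, not part of the course material.

% A toy context-free grammar in DCG notation.
s --> np, vp.
np --> [the], [cat].
vp --> [miaowed].

% accept(+Words) outputs "yes" (succeeds) if Words is a sentence
% of the grammar, and "no" (fails) otherwise.
accept(Words) :- phrase(s, Words).

The query ?- accept([the, cat, miaowed]). succeeds, while ?- accept([miaowed, the, cat]). fails. A parser would additionally return the structure it found.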
An active arc is a structure used by a chart parser as it attempts to parse a sentence. It is derived ultimately from a rule of the grammar being used, and consists of:
· a name for the arc,
· a type - the phrasal category being sought,
· a list of found constituents, i.e. those constituents required by the grammar rule that have already been found,
· a list of the types of the constituents not yet found,
· a from position, indicating the position in the sentence of the start of the first found constituent, and
· a to position, indicating the position in the sentence of the end of the last found constituent.
The symbol → is used to separate the type from the list of found constituents, and a dot is used to separate the list of found constituents from the list of types of constituents not yet found.
= chart (in a chart parser)
Sentences
in English may be in active or passive form. The active form makes the one who is
performing the action in the sentence (termed the agent in semantics) the grammatical subject. This serves to focus
attention on the agent. Example: "John ate the pizza". The
alternative, the passive voice, makes the thing acted on into the grammatical
subject, thus focussing attention on that thing, rather than on the agent.
Example: "The pizza was eaten by John." Many writers appear to believe
that use of the passive seems more formal and dignified, and consequently it is
over-used in technical writing. For example, they might write "The
following experiments were performed" when it would be clearer to say
"We [i.e. the authors] performed the following experiments."
Contrast mood, tense, and aspect.
ADJ: symbol used in grammar rules for an adjective.
An adjective is a word
that modifies a noun by specifying an attribute of the noun. Examples include
adjectives of colour, like red, size or shape, like round or large,
along with thousands of less classifiable adjectives like willing, onerous,
etc. In grammar rules, we use the symbol ADJ for the pre-terminal category of adjectives.
Adjectives
are also used as the complements of sentences with verbs like
"be" and "seem" - "He is happy", "He seems
drunk".
ADJ
is a lexical grammatical
category.
Adjective Phrase (or
adjectival phrase) is a phrasal grammatical category. Adjective phrase is usually abbreviated to
ADJP. They range from simple adjectives (like "green" in "the
green grass") through short lists of adjectives possibly modified by an
adverb or so (like "really large, cream" in "that really large,
cream building") to fairly complicated constructs like "angry that he
had been ignored" in "Jack was angry that he had been ignored".
The longer adjective phrases frequently take the form of an adjective
followed by a complement, which might be a "that"+Sentence
complement (as in "angry that he had been ignored"), or a PP complement or a "to"+VP complement.
The longer ADJPs are most often found as complements of verbs such as "be" and "seem".
ADJP: symbol used in grammar rules for an adjective phrase.
ADV: symbol used in grammar rules for an adverb.
An adverb is a word that
modifies a verb, ("strongly", in "she swam strongly") an adjective, ("very", in "a very strong swimmer") or
another adverb ("very", in "she swam very strongly").
Many
adverbs end with the morpheme -ly, which converts an adjective X into an adverb meaning something like "in an X
manner" - thus "bravely" = "in a brave manner". Other
adverbs include intensifiers like "very" and
"extremely". There are also adverbs of time (like "today",
"tomorrow", "then" - as in "I gave him the book
then"), frequency ("never", "often"), and place
("here", "there", and "everywhere").
ADV
is a lexical grammatical
category.
Adverbial phrases are
phrases that perform one of the functions of an adverb. They include simple
phrases that express some of the same types of concepts that a single adverb
might express, such as frequency - "every week", duration - "for
three weeks", time - "at lunchtime", and manner - "this
way" ("Do it this way"), or "by holding his head under
water for one minute".
Adverbial
Phrase is a phrasal grammatical
category. Adverbial phrase is
usually abbreviated to ADVP.
ADVP: symbol used in grammar rules for an adverbial phrase.
AGENT is a case used in logical forms. It signifies the entity that is acting in an event.
It normally corresponds to the syntactic subject of an active voice declarative sentence. In the logical form for a state
description, the term EXPERIENCER is used for the corresponding entity.
AGENTs appear in the frame-like structures used to describe logical forms:
e.g. the following, representing "John breaks it with the hammer":
break1(e1,
       agent[name(j1, 'John')],
       theme[pro(i1, it1)],
       instr[the<h1, hammer1>])
Agreement is the
phenomenon in many languages in which words must take certain inflections
depending on the company they keep. A simple case occurs with verbs in the
third person singular form and their singular subjects: "Jane likes
cheese" is correct, but * "Jane like cheese" and * "My dogs
likes cheese" are not, because the subjects and verbs do not agree on the number feature. The name used in the lecture notes for the agreement feature is agr.
The possible values of the agr feature are 1s, 2s, 3s, 1p, 2p,
3p, signifying 1st person singular, 2nd person singular, ..., 3rd person
plural. Pronouns like "I" and "me" have agr=1s,
"you" has agr={2s,2p} as it is not possible to
distinguish singular from plural in this case, and so on. Definite noun phrases
like "the green ball" have agr=3s.
This refers to the book by James Allen, Natural Language Understanding, second edition, Benjamin/Cummings, 1995.
The "alphabet"
of a grammar is the set of symbols that it uses, including the terminal symbols (which are like words) and the non-terminal symbols which include the grammatical categories
like N (noun), V (verb), NP (noun phrase),
S (sentence), etc.
See
also context-free grammar, and context-sensitive grammar.
An ambiguity is a
situation where more than one meaning is possible in a sentence. We consider
three types of ambiguity:
· word-sense ambiguity
· structural ambiguity
· referential ambiguity
There
can be situations where more than one of these is present.
An anaphor is an
expression that refers back to a previous expression in a natural language
discourse. For example: "Mary died. She was very
old." The word she refers to Mary, and is
described as an anaphoric reference to Mary. Mary is
described as the antecedent of she. Anaphoric
references are frequently pronouns, as in the example, but may also be definite
noun phrases, as in: "Ronald Reagan frowned. The President was clearly
worried by this issue." Here The President is an
anaphoric reference to Ronald Reagan. The antecedent may in some cases not be explicitly mentioned in a previous sentence - as in "John got out his pencil. He found that the lead was broken." The lead here refers to a subpart of his pencil. The antecedent need not be in the immediately preceding sentence; it could be further back, or in the same sentence, as in "John got out his pencil, but found that the lead was broken." In all our
examples so far the anaphor and the antecedent are noun phrases, but VP and
sentence-anaphora is also possible, as in "I have today dismissed the
prime minister. It was my duty in the circumstances." Here It is
an anaphoric reference to the VP dismissed the prime minister.
For a fairly complete and quite entertaining
treatment of anaphora, see Hirst, G. Anaphora in Natural Language
Understanding: A Survey Springer Lecture Notes in Computer Science
119, Berlin: Springer, 1981.
A feature of some noun phrases. It indicates that the thing described by the noun phrase is
alive, and so capable of acting, i.e. being the agent of some act. This feature
could be used to distinguish between The hammer broke the window and The
boy broke the window - in the former, the hammer is not animate, so
cannot be the agent of the break action (it is in fact the instrument), while the boy is animate, so can be the agent.
See anaphor.
A grammatical relation
between a word and a noun phrase that
follows. It frequently expresses equality or a set membership relationship. For
example, "Rudolph the red-nosed reindeer [had a very shiny nose]" -
here Rudolph = the unique red-nosed reindeer. Another example,
"Freewheelin' Franklin, an underground comic-strip character, [was into
drugs and rock music]", expresses a set membership relation:
Freewheeling_Franklin in "underground comic-strip characters".
Words like
"the", "a", and "an" in English. They are a kind
of determiner. See also the quantifying logical operator THE.
The phrase "I am
reading" is in the progressive aspect, signifying that the
action is still in progress. Contrast this with "I read", which probably does not refer to an action that is currently in progress. Aspect
goes further than this, but we shall not pursue the details of aspect in this
subject. If interested, you could try Huddleston, R., "Introduction to the
Grammar of English" Cambridge, 1984, pp. 157-158 and elsewhere.
= augmented
transition network
An augmented grammar is
what you get if you take grammar rules (usually from a context-free grammar) and add extra information to them, usually in the form of feature information. For example, the grammar rule s → np vp can be
augmented by adding feature information to indicate that, say, the agr feature for the vp and the np must agree:
s(agr(?agr)) →
np(agr(?agr)) vp(agr(?agr))
In
Prolog, we would write something like:
s(P1, P3, Agr) :- np(P1, P2, Agr), vp(P2, P3, Agr).
Actually, this is too restrictive - the agr feature
of a VP, in particular, is usually fairly ambiguous - for example the verb
"love" (and so any VP of which it is the main verb) has agr=[1s,2s,1p,2p,3p],
and we would want it to agree with the NP "we" which has agr=[1p].
This can be achieved by computing the intersection of the agr of
the NP and the VP and setting the agr of the S to be this
intersection, provided it is non-empty. If it is empty, then the S goal should
not succeed.
s(P1, P3, SAgr) :-
    np(P1, P2, NPAgr),
    vp(P2, P3, VPAgr),
    intersection(NPAgr, VPAgr, SAgr),
    nonempty(SAgr).
where intersection computes the intersection of two lists
(regarded as sets) and binds the third argument to this intersection, and nonempty succeeds if its
argument is not the empty list.
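A minimal sketch of the missing pieces, assuming SWI-Prolog (whose lists library provides intersection/3), a difference-list encoding of the positions P1, P2, P3, and two hypothetical lexical rules:

:- use_module(library(lists)).  % provides intersection/3

% nonempty(+List) succeeds if List is not the empty list.
nonempty([_|_]).

% Hypothetical lexical rules; agr values are held as lists of atoms.
np([we|Rest], Rest, ['1p']).
vp([love|Rest], Rest, ['1s','2s','1p','2p','3p']).

s(P1, P3, SAgr) :-
    np(P1, P2, NPAgr),
    vp(P2, P3, VPAgr),
    intersection(NPAgr, VPAgr, SAgr),
    nonempty(SAgr).

With these definitions, ?- s([we, love], [], Agr). succeeds with Agr = ['1p'].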
Augmented
grammar rules are also used to record sem and var features
in computing logical forms, and to express the relationship between the sem and var of
the left-hand side and the sem(s) and var(s) of the
right-hand side. For example, for the rule vp → v (i.e. an intransitive verb),
the augmented rule with sem feature could be:
vp(sem(lambda(X,
?semv(?varv, X))), var(?varv)) →
v(subcat(none), sem(?semv), var(?varv))
where subcat(none) indicates that this only works with an intransitive
verb.
A
parsing formalism for augmented context free grammars. Not covered in current
version of COMP9414, but described in Allen.
AUX: symbol used in grammar rules for an auxiliary verb.
A "helper"
verb, not the main verb. For example, in "He would have read the
book", "would" and "have" are auxiliaries. A
reasonably complete list of auxiliary verbs in English is:
Auxiliary                          Example
do/does/did                        I did read
have/has/had/having                He has read
be/am/are/is/was/were/been/being   He is reading
shall/will/should/would            He should read
can, could                         She can read
may, might, must                   She might read
Complex groupings of auxiliaries can occur, as
in "The child may have been being taken to the
movies".
Some auxiliaries (do, be, and have)
can also occur as verbs in their own right.
Auxiliary verb is often abbreviated to AUX.
AUX is a lexical grammatical category.
Bayes' rule relates the conditional probability Pr(A | B) to Pr(B | A) for two events A and B. The rule states that
Pr(A | B) = Pr(B | A) × Pr(A) / Pr(B)
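As a minimal sketch, the rule can be coded directly in Prolog; the predicate name bayes/4 is just for illustration:

% bayes(+PrBgivenA, +PrA, +PrB, -PrAgivenB):
% computes Pr(A | B) = Pr(B | A) * Pr(A) / Pr(B).
bayes(PrBgivenA, PrA, PrB, PrAgivenB) :-
    PrAgivenB is PrBgivenA * PrA / PrB.

Using the horse-racing figures from the conditional probability entry, ?- bayes(0.5, 0.3, 0.2, P). gives P = 0.75, i.e. Pr(Rain | Win) = 0.75.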
BELIEVE is a modal operator in
the language for representing logical forms.
BELIEVE and other operators like it have some unexpected properties such as failure of substitutivity. For more details, read page 237 in Allen. Page 542 ff. provides yet more on belief in NLP (but this
material is well beyond the needs of COMP9414).
A bigram is a pair of
things, but usually a pair of lexical categories. Suppose that we are concerned
with two lexical categories L1 and L2. The term bigram is used in statistical
NLP in connection with the conditional probability that a word will belong to L2 given that
the preceding word was in L1. This probability is written Pr(L2 | L1), or more
fully Prob(w[i] in L2 | w[i-1] in L1). For example, in the phrase "The
flies", given that The is tagged with ART, we would be concerned with the conditional
probabilities Pr(N | ART) and Pr(V | ART) given that flies can
be tagged with N and V.
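A minimal sketch of how such a probability might be estimated from a tagged corpus, here reduced to a plain list of tags; the predicate names are hypothetical:

% count_pairs(+A, +B, +Tags, -N): N is the number of adjacent
% occurrences of tag A followed by tag B in the list Tags.
count_pairs(_, _, [], 0).
count_pairs(A, B, [A,B|T], N) :- !, count_pairs(A, B, [B|T], N0), N is N0 + 1.
count_pairs(A, B, [_|T], N) :- count_pairs(A, B, T, N).

% count_tag(+A, +Tags, -N): N is the number of occurrences of A in Tags.
count_tag(_, [], 0).
count_tag(A, [A|T], N) :- !, count_tag(A, T, N0), N is N0 + 1.
count_tag(A, [_|T], N) :- count_tag(A, T, N).

% bigram_prob(+L1, +L2, +Tags, -P): estimate Pr(L2 | L1) by counting.
bigram_prob(L1, L2, Tags, P) :-
    count_pairs(L1, L2, Tags, N12),
    count_tag(L1, Tags, N1),
    N1 > 0,
    P is N12 / N1.

For example, ?- bigram_prob(art, n, [art, n, v, art, adj, n], P). gives P = 0.5, since one of the two occurrences of art is followed directly by n.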
bottom-up parser
The
chart parser described in lectures is a bottom-up parser, and can parse
sentences, using any context-free grammar, in cubic time: i.e., in time
proportional to the cube of the number of words in the sentence.
A
bound morpheme is a prefix or suffix, which cannot stand as a word in its own
right, but which can be attached to a free morpheme and
modify the meaning of the free morpheme. For example, "happy" is a
free morpheme, which becomes "unhappily" when the prefix
"un-", and suffix "-ly", both bound morphemes, are
attached.
Number words like one,
two, four, twenty, fifty, hundred, million. Contrast ordinal.
The term case is
used in two different (though related) senses in NLP and linguistics.
Originally it referred to what is now termed syntactic case.
Syntactic case essentially depends on the relationship between a noun (or noun
phrase) and the verb that governs it. For example, in "Mary ate the
pizza", "Mary" is in the nominative or subject case, and
"the pizza" is in the accusative or object case. Other languages may
have a wider range of cases. English has remnants of a couple more cases -
genitive (relating to possession, as with the pronoun "his") and
dative (only with ditransitive verbs - the indirect object of the verb is said to be in the
dative case).
Notice
that in "The pizza was eaten by Mary", "the pizza" becomes
the syntactic subject, whereas it was the syntactic object in the equivalent
sentence "Mary ate the pizza".
With semantic
case, which is the primary sense in which we are concerned with the term case in
COMP9414, the focus is on the meaning-relationship between the verb and the
noun or noun phrase. Since this does not change between "Mary ate the
pizza" and "The pizza was eaten by Mary", we want to use the
same semantic case for "the pizza" in both sentences. The term used
for the semantic case of "the pizza" is theme. Similarly, the semantic case of "Mary" in both
versions of the sentence is agent. Other cases frequently
used include instrument, coagent, experiencer, at-loc, from-loc, and to-loc, at-poss, from-poss,
and to-poss, at-value, from-value, and to-value, at-time, from-time,
and to-time, and beneficiary.
Semantic
cases are also referred to as thematic roles.
Opposite of anaphor, and much rarer in
actual language use. A cataphor is a phrase that is explained by text that
comes after the phrase. Example: "Although he loved
fishing, Paul went skating with his girlfriend." Here he is
a cataphoric reference to Paul.
= context-free
grammar
A chart is a data
structure used in parsing. It consists of a collection of active
arcs (sometimes also
called edges), together with a collection of constituents (sometimes also called inactive arcs or inactive edges).
chart parsing
A
chart parser is a variety of parsing algorithm that maintains a table of well-formed
substrings found so far in the sentence being parsed. While the chart
techniques can be incorporated into a range of parsing algorithms, they were
studied in lectures in the context of a particular bottom-up parsing algorithm.
That algorithm will now be summarized:
to parse a sentence S
using a grammar G and lexicon L:
1. Initially there are no constituents or active
arcs
2. Scan the next word w of the
sentence, which lies between positions i and i+1 in
the sentence.
3. Look up the word w in the
lexicon L. For each lexical category C to which w belongs,
create a new constituent of type C, from i to i+1.
4. Look up the grammar G. For each category C found
in the step just performed, and each grammar rule R whose
right-hand side begins with C, create a new active arc whose rule
is R, with the dot in the rule immediately after the first category
on the right-hand side, and from i to i+1.
5. If any of the active arcs can have their dots
advanced (this is only possible if the arc was created in a previous cycle of
this algorithm) then advance them.
6. If any active arcs are now completed (that is,
the dot is now after the last category on the right-hand side of the active
arc's rule), then convert that active-arc to a constituent (or inactive arc),
and go to step 4.
7. If there are any more words in the sentence, go
to step 2.
to check if an active
arc can have its dot advanced
1. Let the active arc be ARCx: C → C[1] ... C[j] . C[j+1] ... C[n] from m to n.
2. If there is a constituent of type C[j+1] from n to p, then the dot can be advanced.
The resulting new active arc will be:
ARCy: C → C[1] ... C[j+1] . C[j+2] ... C[n] from m to p
where y is a natural number that has not yet been used in an arc-name.
Example: For the active arc ARC2: NP → ART1 . ADJ N from 2 to 3, if there is a constituent ADJ2: ADJ → "green" from 3 to 4 (so that the to position, 3, of the active arc and the type, ADJ, immediately after its dot match the from position, 3, and the type, ADJ, of the constituent ADJ2), then the active arc ARC2 can be extended, i.e. have its dot advanced, creating a new active arc, say ARC3: NP → ART1 ADJ2 . N from 2 to 4.
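The dot-advancing operation can be sketched in Prolog. The term shapes arc(Name, Type, Found, ToFind, From, To) and constituent(Name, Type, From, To) are representational assumptions made for this illustration:

% extend_arc(+Arc, +Constituent, -NewArc): advance the dot of Arc over
% Constituent, provided the constituent's type is the next type sought
% and its from position equals the arc's to position.
extend_arc(arc(_Name, Type, Found, [Next|Rest], From, To),
           constituent(CName, Next, To, P),
           arc(new, Type, NewFound, Rest, From, P)) :-  % a real parser would generate a fresh name
    append(Found, [CName], NewFound).

Repeating the example above: ?- extend_arc(arc(arc2, np, [art1], [adj, n], 2, 3), constituent(adj2, adj, 3, 4), A). gives A = arc(new, np, [art1, adj2], [n], 2, 4).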
The
Chomsky hierarchy is an ordering of types of grammar according to
generality. The classification in fact only depends on the type of grammar
rule or production used. The grammar types described
in COMP9414 included:
· unrestricted grammars (rules of the form a → b with no restrictions on the strings a and b)
· context-sensitive grammars (rules of the form a → b with the restriction length(a) <= length(b))
· context-free grammars (rules of the form X → b where X is a single non-terminal symbol)
· regular grammars (rules of the form X → a and X → aN where X and N are non-terminal symbols, and a is a terminal symbol)
Named
after the linguist Noam Chomsky.
CNP: symbol used in grammar rules for a common noun phrase.
You really need to know
what an agent is before proceeding. A co-agent is someone who acts with the
agent in a sentence. In a sentence with a prepositional phrase introduced by the preposition with, an animate object of the preposition is likely to be a coagent: in "Jane ate the pizza with her mother", her mother is the coagent.
co-refer
common noun
A common noun is a noun that describes a type, for example woman or philosophy, rather than an individual, such as Amelia Earhart. Contrast proper noun.
A common noun phrase is
a phrasal grammatical category of chiefly technical significance.
Examples include "man" "big man" "man with the
pizza", but not these same phrases with "the" or "a"
in front - that is, "the man with the pizza", etc., are NPs, not CNPs. The need for the category CNP as a separate named
object arises from the way articles like
"the" act on a CNP. The word "the", regarded as a natural
language quantifier, acts on the whole of the CNP that it precedes:
it's "the[man with the pizza]", not "the[man] with the
pizza". For this reason, it makes sense to make phrases like "man
with the pizza" into syntactic objects in their own right, so that the
semantic interpretation phase does not need to reorganize the structural
description of the sentence in order to be able to interpret it.
A complement is a
grammatical structure required in a sentence, typically to complete the meaning of a verb or adjective. For example, the verb "believe" can
take a sentential complement, that is, be followed by a sentence,
as in "I believe you are standing on my foot."
There
is a wide variety of complement structures. Some are illustrated in the entry
for subcategorization.
An
example of an adjective with a complement is "thirsty for blood", as
in "The football crowd was thirsty for blood after the home team was
defeated." This is a PP-complement. Another would be "keen to get out
of the stadium", a TO-INF complement, as in "The away-team supporters
were keen to get out of the stadium."
Compositional
semantics signifies a system of constructing logical forms for
sentences or parts of sentences in such a way that the meanings of the
components of the sentence (or phrase) are used to construct the meanings of
the whole sentence (or whole phrase). For example, in "three brown
dogs", the meaning of the phrase is constructed in an obvious way from the
meanings of three, brown and dogs. By way of
contrast, a phrase like "kick the bucket" (when read as meaning
"die") does not have compositional semantics, as the meaning of the whole
("die") is unrelated to the meanings of the component words.
The
semantic system described in COMP9414 assumes compositional semantics.
A
concrete noun is a noun that describes a physical object, for example apple.
Contrast abstract noun.
The conditional
probability of event B given event A is the probability that B
will occur given that we know that A has occurred. The example used in lecture
notes was that of a horse Harry that won 20 races out of 100 starts, but of the
30 of these races that were run in the rain, Harry won 15. So while the
probability that Harry would win a race (in general) would be estimated as
20/100, the conditional probability Pr(Win | Rain) would be estimated as 15/30
= 0.5. The formal definition of Pr(B | A) is Pr(B & A) / Pr(A). In the case
of B = Win and A = Rain, Pr(B & A) is the probability that it will be
raining and Harry will win (which on the data given above is 15/100), while
Pr(A) is the probability that it will be raining, or 30/100. So again Pr(B | A)
= 0.15/0.30 = 0.5
CONJ: symbol used in grammar rules for a conjunction.
A
conjunction is a word used to join two sentences together to make a larger
sentence. Conjunctions include coordinate conjunctions, like
"and", "or" and "but": "Jim is happy and
Mary is proud", "India will win the test match or I'm a monkey's
uncle".
There are also subordinate conjunctions,
like "if" and "when", as in "I will play with you if you
will lend me your marbles" and "I will lend you this book when you
return the last one you borrowed".
Conjunctions may also be used to join nouns,
adjectives, adverbs, verbs, phrases ...
Examples:
nouns: Boys and girls [come out to play].
adjectives: [The team colours are] black and yellow.
adverbs: [He was] well and truly [beaten].
verbs: [Mary] played and won [her match].
phrases: across the river and into the trees; [She] fell down and hit her head.
Conjunction is often abbreviated to CONJ.
CONJ is a lexical grammatical category.
A
constituent, in parsing, is a lexical or phrasal category that has been found in a sentence being parsed, or
alternatively one that is being sought for but has not yet been found.
See active arc. When an active arc is completed (when all its sub-constituents
are found), the active arc becomes a constituent.
Constituents are used to create new active arcs - when there is a constituent X1 of type X, and a grammar rule whose right-hand side starts with the grammar symbol X, then a new active arc based on that rule may be created, with the constituent X1 listed as a found constituent for the active arc (the only one, so far).
The components of a constituent, as recorded in the chart parsing algorithm described in lectures, are illustrated below for the example NP1: NP → ART1 ADJ1 N1 from 0 to 3:

component      example        notes
name           NP1            usually formed from the type + a number
type           NP             a phrasal or lexical category of the grammar
decomposition  ART1 ADJ1 N1   ART1, ADJ1 and N1 would be the names of other constituents already found
from           0              sentence position of the left end of this NP
to             3              sentence position of the right end of this NP
context-free
A context-free grammar is defined to be a 5-tuple (P, A, N, T, S) with components as follows:
P: a set of grammar rules or productions, that is, items of the form X → a, where X is a member of the set N (that is, a non-terminal symbol) and a is a string over the alphabet A. An example would be the rule NP → ART ADJ N, which signifies that a Noun Phrase can be an ARTicle followed by an ADJective followed by a Noun, or N → horse, which signifies that horse is a Noun. NP, ART, ADJ, and N are all non-terminal symbols, and horse is a terminal symbol.
A: the alphabet of the grammar, equal to the disjoint union of N and T.
N: the set of non-terminal symbols (i.e. grammatical or phrasal categories).
T: the set of terminal symbols (i.e. words of the language that the grammar defines).
S: a distinguished non-terminal, normally interpreted as representing a full sentence (or program, in the case of a programming language grammar).
context-sensitive
A
context-sensitive grammar is a grammar with context-sensitive rules. There are
two equivalent formulations of the definition of a context-sensitive grammar
rule (cf. Chomsky hierarchy):
· rules of the form a → b where a and b are strings of alphabet symbols, with the restriction that length(a) <= length(b)
· rules of the form l X r → l b r where l, r, and b are (possibly empty) strings of alphabet symbols, and X is a non-terminal. l and r are referred to as the left and right context for X → b in the context-sensitive rule.
Context-sensitive
grammars are more powerful than context-free grammars, but they are much harder
to work with.
A corpus is a large body
of natural language text used for accumulating statistics on natural language
text. The plural is corpora. Corpora often include extra
information such as a tag for each word indicating its part-of-speech, and perhaps the parse tree for each sentence.
See
also statistical NLP.
A noun of a type that can
be counted. Thus horse is a count noun, but water is
not. Contrast mass noun.
= context-sensitive
grammar
= discourse
entity
DE list
= indicative.
A
kind of determiner, that is, an ingredient of noun phrases.
This class of words includes "this", "that",
"these", and "those". They are part of the reference system of English. That is, they is used to tell which of a
number of possibilities for the interpretation of the rest of the noun phrase is in fact intended. Demonstratives are most useful in
spoken language, and are often accompanied by a pointing gesture.
A derivation of a sentence of a grammar is, in effect, a proof that the sentence can be derived from
the start symbol of the grammar using the grammar rules and
a rewriting process. For example, given the grammar 1. S → NP VP, 2.
NP → ART N, 3. VP → V, and lexical rules 4. ART → "the", 5. N →
"cat", and 6. V → "miaowed", we can derive the sentence
"the cat miaowed" as follows:
S ⇒ NP VP (rule 1)
  ⇒ ART N VP (rule 2)
  ⇒ the N VP (rule 4)
  ⇒ the cat VP (rule 5)
  ⇒ the cat V (rule 3)
  ⇒ the cat miaowed (rule 6)
One can then write S ⇒* "the cat miaowed": i.e. ⇒* is the symbol for the derivation
relation. The symbol ⇒ is referred to as direct
derivation. A sentential form is any string that can be
derived (in the sense defined above) from the start symbol S.
The sense in which the
term grammar is primarily used in Natural Language Processing. A grammar is a
formalism for describing the syntax of a language. Contrast prescriptive grammar.
Determiners are one of
the ingredients of noun phrases. Along with cardinals and ordinals, they make up the set of specifiers,
which assist in reference - that is, determining exactly which of
several possible alternative objects in the world is referred to
by a noun phrase. They come in several varieties - articles, demonstratives, possessives,
and quantifying determiners.
A discourse entity (DE)
is a something mentioned in a sentence that could act as a possible antecedent
for an anaphoric reference, e.g. noun phrases, verb phrases and sentences.
For example, with the sentence "Jack lost his wallet in his car", the
DEs would include representations of "Jack" "his wallet",
"his car", "lost his wallet in his car" and the whole
sentence. The whole sentence could serve as the antecedent for "it"
in a follow-up sentence like "He couldn't understand it" (while
"Jack" would be the antecedent of "He").
Sometimes
discourse entities have a more complex relation to the text. For example, in
"Three boys each bought a pizza", clearly "Three boys"
gives rise to a DE that is a set of three objects of type boy (B1: |B1| = 3 and B1
subset_of {x|Boy(x)}), but "a pizza", in this context, gives rise to
a representation of a set P1 of three pizzas (whereas in the usual case "a
pizza" would give rise to a DE representing a single pizza.)
P1 = {p | pizza(p) and exists(b) : Boy(b) and p = pizza_bought_by(b)}.
The function "pizza_bought_by" is the Skolem function referred to in lectures as "sk4".
See history list.
See context-free grammar.
A
verb in English that can take two objects, like give, as in "He gave his
mother a bunch of flowers". Here "his mother" is the indirect
object and "a bunch of flowers" is the direct object.
The same sentence can also be expressed as "He gave a bunch of flowers to his
mother", with the direct and indirect objects in the opposite order, and
the indirect object marked by the preposition "to".
The preposition in such cases is usually "to" or "for" (as in "He bought his mother a bunch of flowers" = "He bought a bunch of flowers for his mother").
Ditransitive verbs can appear with just one or even no syntactic objects ("I gave two dollars", "I gave at the office") - their distinguishing characteristic is that they can have two objects, unlike intransitive and transitive verbs.
Ellipsis refers to
situations in which sentences are abbreviated by leaving out parts of them that
are to be understood from the context. For example, if someone asks "What
is your name?" and the reply is "John Smith" then this can be
viewed as an elliptical form of the full sentence "My name is John Smith".
Ellipsis causes problems for NLP since it is
necessary to infer the rest of the sentence from the context.
"ellipsis" is also the name of the
symbol "..." used when something is omitted from a piece of text, as
in "Parts of speech include nouns, verbs, adjectives, adverbs, determiners,
... - the list goes on and on."
"elliptical" is the adjectival form of
"ellipsis".
An
embedded sentence is a sentence that is contained inside another sentence. Some
examples, with the embedded sentence in italics:
· John believes that Mary likes pizza
· If Mary likes pizza then she may come to our pizza party.
· If Joan liked pizza then she would come to our pizza party.
A noun phrase is said to evoke a discourse entity if the noun phrase refers to something related to a previously mentioned discourse entity, but not itself already mentioned. For example, in "Jack lost his wallet in his
car. Later he found it under the front seat.", the phrase "the front
seat" evokes a discourse entity that has not actually been mentioned, but
which is in a sense already present as part of the DE
created by the phrase "his car".
See
also anaphor.
"exists" is a
textual way of writing the existential quantifier, which is
otherwise written as a back-to-front capital E. It corresponds fairly closely
to the English word "some". Thus,
exists(X, likes(X, spinach))
would be read as "for some entity X, X
likes spinach" or just "something likes spinach". This might be
too broad a statement, as it could be satisfied, for example, by a snail X that
liked spinach. It is common therefore to restrict the proposition to something
like:
exists(X, is_person(X) and likes(X, spinach))
i.e. "Some person likes icecream."
That is, we are restricting the type of X to persons. In some cases, it is more
reasonable to abbreviate the type restriction as follows:
exists(X : person, likes(X, spinach))
See also forall and Skolem functions.
Experiencer is
a case that usually fills a similar syntactic role to the agent but where the entity involved cannot be said to act.
It is thus associated with the use of particular verbs like
"remember", as in "Jim remembered his homework when he got to
school". Here "Jim" is the experiencer of the
"remember" situation.
In some situations,
things that are equal cannot be substituted for each other in logical forms.
Consider believe(sue1, happy1(jack1)) - jack1 may = john22 (i.e. the individual known as Jack may also be called John, e.g. by other people), but Sue believes John is happy may not be true, e.g. because Sue may not know that jack1 = john22. Thus john22 cannot be substituted for jack1, even though they are equal in some sense. See also Allen pp. 237-238.
Features can be thought
of as slots in a lexicon entry or in
structures used to build a logical
form. They record syntactic
or semantic information about the word or phrase. Examples include the agr agreement feature, the sem feature that
records the logical form of a word or phrase, and the var feature that
records the variable used to name the referent of a phrase in a logical form.
first person
One of the choices for
the person feature. A sentence is "in the first person" if
the subject of the sentence is the speaker, or the
speaker and some other individual(s), as in "I like pizza" and
"We like pizza".
"I"
and "we" are first-person pronouns, as are "me",
"us". Other words with the first-person feature include
"mine", "my", "myself", "ours",
"our", and "ourselves".
This stands for First
Order Predicate Calculus, a standard formulation of logic that has logical
operators like and, or, and not, predicate
symbols and constants and functions, and terms built from these, together with the quantifiers forall and exists. It is common for
semantic representation systems in NLP to be expressed in languages that
resemble or are based on FOPC, though sometimes they add significant features
of more elaborate logical systems.
"forall" is a
textual way of writing the universal quantifier, which is otherwise
written as an upside-down capital A. It corresponds fairly closely to the
English words "each" and "every". Thus,
forall(X, likes(X, icecream))
would be read as "for every entity X, X
likes icecream" or just "everything likes icecream". This would
be too broad a statement, as it would allege that, for example, rocks like
icecream. It is usual therefore to restrict the proposition to something like:
forall(X, is_person(X) ⇒ likes(X, icecream))
i.e. "Every person likes icecream."
That is, we are restricting the type of X to persons. In some cases, it is more
reasonable to abbreviate the type restriction as follows:
forall(X : person, likes(X, icecream))
See also exists.
A
free morpheme is a basic or root form of a word, to which can be attached bound morphemes that
modify the meaning. For example, "happy" is a free morpheme, which
becomes "unhappy" when the prefix "un-", a bound morpheme,
is attached.
See modal
operators - tense and tense - future.
See tense.
One of the features of a noun phrase. In English, gender is only marked in third-person singular pronouns and associated words. The possible values of the gender feature are masculine, feminine, and neuter.

type                            masculine   feminine   neuter   example
pronoun (nominative)            he          she        it       He hit the ball.
pronoun (accusative)            him         her        it       Frank hit him.
pronoun (possessive adjective)  his         her        its      Frank hit his arm.
pronoun (possessive)            his         hers       its      The ball is his.
pronoun (reflexive)             himself     herself    itself   Frank hurt himself.
An
alternative grammatical formalism, in which, among other things, the
non-terminal symbols of context-free grammars are replaced by sets of features, and the
grammar rules show the relationships between these objects much as context-free
rules show the relationships between grammar symbols in a CFG.
If there is a derivation of
a sentence from a grammar, then the grammar is said to generate the
sentence.
= generalized
phrase structure grammar
1. A system for describing a language, the rules of
a language.
2. A formal system for describing the syntax of a language. In
COMP9414, we are principally concerned with context-free grammars, sometimes augmented.
See
also Chomsky hierarchy.
See Chomsky hierarchy and context-free
grammars.
= head-driven
phrase structure grammar
A
head feature is one for which the feature value on a parent category must
be the same as the value on the head subconstituent. Each phrasal category has
associated with it a head subconstituent - N, NAME or PRO or
CNP for NPs, VP for S, V for VP, P (= PREP) for PP.
For example, var is a head feature for a range of phrasal categories, including S. This means that an S gets its var feature by copying the var feature of its head subconstituent, namely its VP. Head features are discussed on pages 94-96 of Allen.
See head feature.
head-driven phrase
structure grammar
A
Hidden Markov Model, for our purposes in COMP9414, is a set of states (lexical
categories in our case) with directed edges (cf. directed graphs) labelled with transition
probabilities that indicate the probability of moving to the state at
the end of the directed edge, given that one is now in the state at the start
of the edge. The states are also labelled with a function which indicates the
probabilities of outputting different symbols if in that state (while in a
state, one outputs a single symbol before moving to the next state). In our
case, the symbol output from a state/lexical category is a word belonging to
that lexical category. Here is an example (presented in the original notes as a state-transition diagram, which is not reproduced here):
Using
this model, the probability of generating "dogcatchers catch old red
fish" can be calculated as follows: first work out the probability of the
lexical category sequence → N → V → ADJ → ADJ → N, which is 1 × 1 × 0.5 ×
0.1 × 0.9 = 0.045, and then multiply this by the product of the output
probabilities of the words, i.e. by 0.3 × 0.2 × 0.6 × 0.2 × 0.5 = 0.0036, for a
final probability of 0.000162.
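A minimal sketch of this calculation in Prolog. Since the example diagram is not reproduced above, the trans/3 (transition) and out/3 (output) facts below are reconstructed from the worked numbers and should be read as assumptions:

% trans(FromState, ToState, Probability)
trans(start, n, 1.0).
trans(n, v, 1.0).
trans(v, adj, 0.5).
trans(adj, adj, 0.1).
trans(adj, n, 0.9).

% out(State, Word, Probability)
out(n, dogcatchers, 0.3).
out(v, catch, 0.2).
out(adj, old, 0.6).
out(adj, red, 0.2).
out(n, fish, 0.5).

% seq_prob(+State, +CatWordPairs, -P): probability of generating the
% given category/word sequence starting from State.
seq_prob(_, [], 1.0).
seq_prob(State, [Cat-Word|Rest], P) :-
    trans(State, Cat, PT),
    out(Cat, Word, PO),
    seq_prob(Cat, Rest, P0),
    P is PT * PO * P0.

The query ?- seq_prob(start, [n-dogcatchers, v-catch, adj-old, adj-red, n-fish], P). gives P ≈ 0.000162, matching the calculation above.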
This
is the list of discourse entities mentioned in recent sentences, ordered
from most recent to least recent. Some versions also include the syntactic and
semantic analyses of the previous sentence (or previous clause of a compound
sentence). Some versions keep only the last few sentences worth of discourse
entities, others keep all the discourse entities since the start of the
discourse.
= Hidden
Markov model
ill-formed text
Much "naturally occurring" text
contains some or many typographical errors or other errors. Industrial-strength
parsers have to be able to deal with these, just as people can deal with typos
and ungrammaticality. Such a parser is called a robust parser.
An imperative sentence
is one that expresses a command, as opposed to a question or a statement. See
also WH-question, Y/N-question, indicative, subjunctive, and mood.
Two
events A and B are said to be statistically independent if Pr(B | A) = Pr(B) -
i.e. whether or not A is a fact has no effect on the probability that B will
occur. Using the definition of conditional probability, this can be reformulated as Pr(A and B) = Pr(A)
× Pr(B).
An indicative sentence
is one that makes a statement, as opposed to a question or a command. See also WH-question, Y/N-question, imperative, subjunctive,
and mood.
A form of verbs. In English, this form
is introduced by the word "to" - the infinitive particle.
Examples: to go, to rather prefer, to have
owned. Verb phrases that have an infinitive verb construction in them are referred to as infinitive verb phrases; constructions that are not infinitive are referred to as non-infinitive. See vp:inf and np_vp:inf
in the article on subcategorization.
See
also here for the distinction between
"infinite" and "infinitive".
An inflection is a type
of bound morpheme, with a grammatical function. For example, the
suffix "-ing" is an inflection which, when attached to a participle form of the verb. Other inflections in English form the
other parts of verbs (such as the past tense and past
participle forms), and the plural of nouns.
Some
words inflect regularly, and some inflect irregularly, like the plural form
"children" of "child", and the past tense and past
participle forms "broke" and "broken" of the verb
"break".
A semantic case, frequently appearing
as a prepositional phrase introduced by the preposition
"with". For example, in "Mary ate the pizza with her
fingers", the prepositional phrase "with her fingers" indicates
the instrument used in the action described by the sentence.
A
kind of adverb, used to indicate the level or intensity of an adjective or another adverb. Examples include "very",
"slightly", "rather", "somewhat" and
"extremely". An example of use with an adjective: "Steve was
somewhat tired". An example of use with an adverb: "Mary ran very
quickly".
INTERJ: symbol used in grammar rules for an interjection.
Interjection
is often abbreviated to INTERJ.
INTERJ is a lexical grammatical category. It usually appears as a single word utterance, indicating some
strong emotion or reaction to something. Examples include: "Oh!",
"Ouch!", "No!", "Hurray!" and a range of
blasphemies and obscenities, starting with "Damn!".
A
verb that can take no syntactic object, like laugh, as in "He laughed
loudly", or "She laughed at his remark". Contrast ditransitive and transitive. See also subcategorization.
= knowledge
base
See knowledge representation language.
See knowledge representation language.
The
term knowledge representation language (KRL) is used to refer to the language
used by a particular system to encode the knowledge. The collection of
knowledge used by the system is referred to as a knowledge base (KB).
= knowledge
representation language
The process of applying
a lambda-expression to its argument (in general, arguments, but the examples
we've seen in COMP9414 have all been single argument lambda-expressions). A
lambda expression is a formula of the form (lambda ?x P(?x)), in an Allen-like notation, or lambda(X, p(X)) in
a Prolog-ish notation. P(?x) (or p(X)) signifies a formula involving the
variable ?x (or X). The lambda-expression can be viewed as a function to be
applied to an argument. The result of applying lambda(X, p(X)) to an argument a
is p(a) - that is, the formula p(X) with all the instances of the variable X
replaced by a. Using a more clearly NLP example, if we apply lambda(X, eat1(l1,
X, pizza1)) to mary1 we get eat1(l1, mary1, pizza1)).
Prolog
code for lambda-reduction is:
lambda_reduce(lambda(X, Predicate), Argument, Predicate) :-
X = Argument.
Applying this to an actual example:
?- lambda_reduce(
lambda(X, eats(e1,
X, the1(p1, pizza1))),
name(m1, 'Mary'),
Result) ?
X = name(m1, 'Mary')
Result = eats(e1, name(m1, 'Mary'), the1(p1, pizza1))
The language generated
by a grammar is the set of all sentences that can be derived from the start symbol S
of the grammar using the grammar rules. Less formally, it is the set of all
sentences that "follow from" or are consistent with the grammar
rules.
Parsing that processes the words of the sentence from left to right (i.e. from beginning to end), as opposed
to right-to-left (or end-to-beginning) parsing. Logically
it may not matter which direction parsing proceeds in, and the parser will work,
eventually, in either direction. However, right-to-left parsing is likely to be
less intuitive than left-to-right. If the sentence is damaged (e.g. by the
presence of a mis-spelled word) it may help to use a parsing algorithm that
incorporates both left-to-right and right-to-left strategies, to allow one to
parse material to the right of the error.
A set of word forms with the same stem, the same major part-of-speech, and the same word-sense. E.g. {cat, cats}.
Fancy name for a word,
including any suffix or prefix. Contrast free and bound morphemes.
A
grammatical formalism, not covered in COMP9414.
The probability that a
particular lexical category (in context or out of context) will give rise to a
particular word. For example, suppose that, in a system with a very small lexicon, there are only two nouns, say cat and dog. Given a corpus of sentences using this lexicon, one could count the number of times that the two words cat and dog occurred (as nouns), say ncats and ndogs. Then the lexical generation probability for cat as a noun would be ncats/(ncats+ndogs), written symbolically as Pr(cat | N).
A
rule of a grammar (particularly a context-free grammar) of
the form X → w, where w is a single word. In most lexicons, all the lexical insertion rules for a
particular word are "collapsed" into a single lexical entry, like
"pig": N V ADJ.
"pig"
is familiar as an N, but also occurs as a verb ("Jane pigged herself on
pizza") and an adjective, in the phrase "pig iron", for example.
Synonymous with part-of-speech (POS). Also called a pre-terminal symbol. A kind of non-terminal symbol of a grammar - a non-terminal is a lexical symbol if it can appear in a
lexical insertion rule. Examples are N, V, ADJ, PREP, INTERJ, ADV. Non-examples
include NP, VP, PP and S (these are non-terminals). The term lexical
category signifies the collection of all words that belong to a
particular lexical symbol, for example, the collection of all Nouns or the
collection of all ADJectives.
Contrast
with phrasal category.
A lexicon is a collection of information about the words of a language, including the lexical categories to which they belong. A lexicon is usually structured as a collection of lexical entries, like ("pig" N V ADJ).
"pig" is familiar as a N, but also occurs as a verb ("Jane
pigged herself on pizza") and an adjective, in the phrase "pig
iron", for example. In practice, a lexical entry will include further
information about the roles the word plays, such as feature information - for
example, whether a verb is transitive, intransitive, ditransitive, etc., what
form the verb takes (e.g. present participle, or past tense, etc.)
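In a Prolog-based system, such a lexicon might be stored as facts along the following lines; the lex/3 format with a feature list is an illustrative assumption:

% lex(Word, Category, Features)
lex(pig, n, [agr('3s')]).
lex(pig, v, []).
lex(pig, adj, []).
lex(ate, v, [form(past), subcat(np)]).  % transitive, past tense

% categories(+Word, -Cats): all lexical categories of Word.
categories(Word, Cats) :- findall(C, lex(Word, C, _), Cats).

The query ?- categories(pig, Cats). then gives Cats = [n, v, adj], the "collapsed" entry described under lexical insertion rule.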
= lexical
functional grammar
The local discourse
context, or just local context includes the syntactic and semantic analysis of
the preceding sentence, together with a list of objects mentioned in the
sentence that could be antecedents for later pronouns and definite noun phrases.
Thus the local context is used for the reference stage of NLP. See
also history list.
Logical
forms are expressions in a special language, resembling FOPC (first order predicate calculus) and used to encode the meanings (out of
context) of NLP sentences. The logical form language used in the book by James Allen includes:
terms
constants or expressions that describe objects:
fido1, jack1
predicates
constants or expressions that describe relations
or properties, like bites1. Each predicate has an associated
number of arguments - bites1 is binary.
propositions
a predicate followed by the appropriate number
of arguments: bites1(fido1, jack1), dog1(fido1) - Fido is a dog. More complex propositions can be constructed using logical operators: not(loves1(sue1,
jack1)), &(bites1(fido1, jack1), dog1(fido1)).
quantifiers
English has some precise quantifier-like words: some, all, each, every, the, a, as well as vague ones: most, many, a few. The logical form language has quantifiers to encode the meanings of each quantifier-like word.
variables
are needed because of the quantifiers, and
because while the words in a sentence in many cases give us the types of the
objects, states and events being discussed, but it is not until a later stage
of processing (reference) that we know to what instances of those types the
words refer.
Variables
in logical form language, unlike in FOPC, persist beyond the "scope"
of the quantifier. E.g. A man came in. He went to the table. The
first sentence introduces a new object of type man1. The He in the
second sentence refers to this object.
NL
quantifiers are typically restricted in the range of objects that the variable
ranges over. In Most dogs bark the variable in the most1
quantifier is restricted to dog1 objects: most1(d1 : dog1(d1), barks1(d1)).
predicate operators
A predicate operator takes a predicate as an
argument and produces a new predicate. For example, we can take a predicate
like cat1 (a unary predicate true of a single object of type cat1) and apply
the predicate operator plur, which converts a singular predicate into the corresponding plural predicate: plur(cat1) is true of any set of cats with
more than one member.
modal operators
Modal operators are used to represent certain verbs like believe, know,
want, which express attitudes to other propositions, and for tense and other purposes. Sue believes Jack is happy becomes
believe(sue1,
happy1(jack1))
With
tenses, we use the modal operators pres, past, fut, as in:
pres(sees1)(john1, fido1)
past(sees1)(john1, fido1)
fut(sees1)(john1, fido1)
logical operators
The operators and, or, not, ⇒ (implies), and ⇔ (equivalent to). and is sometimes written as &. They are used to connect propositions to make larger propositions: e.g.
is-blue(sky1) and is-green(grass1) or can-fly(pig1)
A
parsing technique, not covered in COMP9414.
A noun that cannot be
counted. Water is a mass noun, as is sand (if
you want to count sand, you refer to grains). Contrast count noun.
A modal auxiliary is distinguished syntactically by the fact
that it forces the main verb that follows it to take the infinitive form. For example, "can", "do",
"will" are modal ("she can eat the pizza", "she does
eat pizza", "she will eat pizza") but "be" and
"have" are not ("she is eating pizza", "she has eaten
pizza").
As far as we are
concerned in COMP9414, modal operators are a feature of the logical form language
used to represent certain epistemic verbs like
"believe", "know" and other verbs like "want",
and the tense operators, which convert an untensed
logical form into a tensed one.
Thus
if likes1(jack1, sue1) is a formula in the logical form language, then we can
construct logical forms like know(mary1, likes1(jack1, sue1)) meaning that Mary
knows that Jack likes Sue. Similarly for believe(mary1, likes1(jack1, sue1))
and want(marg1, own(marg1, (?obj : &(porsche1(?obj),
fire_engine_red(?obj))))) - that's Marg wants to own a fire-engine red
Porsche.
The tense operators include fut, pres, and past, representing future, present and past. For example, fut(likes1(jack1, sue1)) would represent Jack will like Sue.
See
also failure of substitutivity.
See also articles on
individual moods.
indicative - A plain statement. Example: John eats the pizza.
imperative - A command. Example: Eat the pizza!
WH-question - A question with a phrasal answer, often starting with a question-word beginning with "wh". Examples: Who is eating the pizza? What is John eating? What is John doing to the pizza?
Y/N-question - A question with a yes/no answer. Example: Did John eat the pizza?
subjunctive - An embedded sentence that is counter-factual but must be expressed in order to, e.g., explain a possible consequence. Example: If John were to eat more pizza he would be sick.
A unit of language
immediately below the word level. See free morpheme and bound morpheme, and morphology.
The study of the
analysis of words into morphemes, and conversely of the synthesis of words from morphemes.
A rather vague natural
language quantifier, corresponding to the word "most" in English.
"Many", "a few", and "several" are other
quantifier-type expressions that are similarly problematical in their
interpretation.
N: symbol used in grammar rules for a noun.
n-gram
An n-gram is an n-tuple of things,
but usually of lexical categories. Suppose that we are concerned with n lexical
categories L1, L2, ..., Ln.
The term n-gram is used in statistical NLP in connection with the conditional
probability that a word will
belong to Ln given that the preceding words were in L1, L2,
..., Ln–1. This probability is written Pr(Ln | Ln–1...L2 L1),
or more fully Prob(wi ∈ Ln | wi–1 ∈ Ln–1 ∧ ... ∧ wi–n+1 ∈ L1). See also bigram and trigram,
and p. 197 in Allen.
Word for a noun functioning as an adjective, as with the word "wood" in "wood fire".
Longer expressions constructed from nominals are possible. It can be difficult
to infer the meaning of the nominal compound (like "wood fire") from
the meanings of the individual words - for instance, while "wood
fire" presumably means a fire made with wood, "brain damage"
means damage to a brain, rather than damage made with a brain. Another example:
"noun modifier" could on the face of it either mean a noun that acts
as a modifier (i.e. a nominal as just defined) or a modifier of a noun.
In
fact, noun modifier is a synonym for nominal.
non-terminal
A non-terminal symbol of
a grammar is a symbol that represents a lexical or phrasal category in a language. Examples in
English would include N, V, ADJ, ADV (lexical categories) and NP, VP, ADJP,
ADVP and S (phrasal categories). See also terminal symbol and context-free
grammar.
A noun is a word
describing a (real or abstract) object. See also mass
noun, count noun, common noun, abstract noun, proper noun, and concrete
noun.
Contrast verb, adjective, adverb, preposition, conjunction, and interjection.
Noun
is often abbreviated to N.
N
is a lexical grammatical
category.
= nominal.
Noun Phrase is a phrasal grammatical category. Noun phrase is usually abbreviated to NP. NPs have a noun as their head, together with (optionally) some of the
following:
adjectives, nominal modifiers (i.e. other nouns, acting as though they were adjectives), certain kinds of adverbs that modify the adjectives, as with "very" in "very bright lights", participles functioning as adjectives (as in "hired man" and "firing squad"), cardinals, ordinals, determiners, and quantifiers. There are constraints on the way these ingredients can be put together. Here are some examples of noun phrases: Ships (as in Ships are expensive to build), three ships (cardinal + noun), all three ships (quantifier + cardinal + noun), the ships (determiner + noun), enemy ships (nominal + noun), large, grey ships (adjective + adjective + noun), the first three ships (determiner + ordinal + cardinal + noun), my ships (possessive + noun).
NP: symbol used in grammar rules for a noun phrase.
The term grammatical
number refers to whether the concept described consists of a single unit (singular number), like "this pen", or to
more than one unit (plural number), like "these pens", or
"three pens".
In
some languages other than English, there may be different distinctions drawn -
some languages distinguish between one, two, and many, rather than just one and
many as in English.
Nouns in English are
mostly marked for number - see plural.
Pronouns and certain determiners may also be marked for number. For example, "this"
is singular, but "these" is plural, and "he" is singular,
while "they" is plural.
The object of a sentence is the noun phrase that appears after the verb in a declarative English sentence. For example, in The cat ate the
pizza, the pizza is the object. In The pizza was
eaten by the cat, there is no object. Object noun phrases can be
arbitrarily long and complex. For example, in He ate a pizza with lots
of pepperoni, pineapple, capsicum, mushrooms, anchovies, olives, and vegemite,
the object is a pizza with lots of pepperoni, pineapple, capsicum,
mushrooms, anchovies, olives, and vegemite. [No, I do not have shares in a
pizza company.]
See
also ditransitive, transitive,
and intransitive.
A form of number word
that indicates rank rather than value. Thus "one, two, three, four, five,
six, seven" are cardinal numbers, whose corresponding ordinal numbers are
"first, second, third, fourth, fifth, sixth, seventh".
= lexical generation probability, but used in the context of a Hidden Markov Model.
A parse tree is a way of
representing the output of a parser, particularly with a context-free grammar. Each phrasal constituent found
during parsing becomes a branch node of the parse tree. The words of the
sentence become the leaves of the parse tree. As there can be more than one
parse for a single sentence, there can be more than one parse tree. For example, for
the sentence "He ate the pizza", with respect to the grammar with
rules
S → NP VP, NP → PRO, NP
→ ART N, VP → V NP,
and lexicon
("ate" V)
("he" PRO) ("pizza" N) ("the" ART)
the parse tree is:

              S
             /  \
           NP    VP
           |    /  \
          PRO  V    NP
           |   |   /  \
          He  ate ART   N
                   |    |
                  the  pizza
Note that this graphical representation of the
parse tree is unsuitable for further computer processing, so the parse tree is
normally represented in some other way internally in NLP systems. For example,
in a Prolog-like notation, the tree above could be represented as:
s(np(pro("He")),
vp(v("ate"),
np(art("the"), n("pizza")))).
A parser is an algorithm
(or a program that implements that algorithm) that takes a grammar, a lexicon, and a string of words,
decides whether the string of words can be derived from the grammar and lexicon
(i.e. is a sentence with respect to the grammar and lexicon).
If
so, it produces as output some kind of representation of the way (or ways) in
which the sentence can be derived from the grammar and lexicon. A common way of
doing this is to output (a) parse tree(s).
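As an illustration, a grammar and lexicon like the ones in the parse tree entry above can be rendered directly as a Prolog definite clause grammar (DCG). The following is a sketch, not code from the course materials, and the predicate names are assumptions. Each non-terminal carries its parse tree as an argument:

% Grammar: S → NP VP, NP → PRO, NP → ART N, VP → V NP
s(s(NP, VP))   --> np(NP), vp(VP).
np(np(PRO))    --> pro(PRO).
np(np(ART, N)) --> art(ART), n(N).
vp(vp(V, NP))  --> v(V), np(NP).

% Lexicon
pro(pro(he))  --> [he].
v(v(ate))     --> [ate].
art(art(the)) --> [the].
n(n(pizza))   --> [pizza].

The query phrase(s(T), [he, ate, the, pizza]) then binds T to s(np(pro(he)), vp(v(ate), np(art(the), n(pizza)))), essentially the representation shown in the parse tree entry. Note that Prolog executes a DCG as a top-down backtracking parser - in effect, a predictive parser.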
The process of going
through a corpus of sentences and labelling each word in each sentence with
its part of speech. A tagged corpus is a corpus that has been
so labelled. A tag is one of the labels. Large-scale corpora
might use tag-sets with around 35-40 different tags (for
English). See Allen Fig. 7.3 p. 196 for an example of a
tag-set.
part of speech, POS
Synonymous with lexical category: the role, like noun, verb, adjective, adverb, pronoun,
preposition, etc. that a word is either playing in a particular sentence (e.g. like is
acting as a preposition in I like pizza) or that it can play in
some sentence: e.g. like can act as a verb, noun, adjective,
adverb, preposition, and conjunction. (It can also act as a "filled
pause", as do um, er, and uh -
e.g. He's, like, a pizza chef in this, like, fast food joint
downtown.)
Participles come in two
varieties (in English) - present participles and past
participles. (Often abbreviated to PRESPART and PASTPART or something
equivalent, like ING and EN). Present participles are variants on verbs; they end in
"-ing", as in "setting", "being",
"eating", "hiring". Past participles end in
"-ed", "-en", or a few other possibilities, as in
"set" (past participle the same as theinfinitive form of the verb), "been", "eaten",
"hired", "flown" (from "fly").
Participles
are used in constructing tensed forms of verbs, as in "he is eating",
"you are hired", and also as though they were adjectives in phrases
like "a flying horse" and "a hired man".
In
some cases, present participles have become accepted as nouns representing an
instance of the action that the underlying verb describes, as with
"meeting".
PRESPART
and PASTPART are lexical grammatical
categories.
A particle is usually a
word that "normally" functions as a preposition,
but can also modify the sense of a verb. Not all prepositions
can be particles. An example of a word functioning as a particle is
"up", in "The mugger beat up his victim". Here "beat
up" functions as a unit that determines the action being described. A
telltale sign of a particle is that it can often be separated from the verb, as
in "The mugger beat the victim up". Sometimes it can be non-trivial
for an NLP system to tell whether a word is being used as a particle or as a
preposition. For example, in "Eat up your dinner", "up" is
definitely a particle, but in "He eats up the street", "up"
is a preposition, but it takes real-world knowledge to be sure of this, as the
alternative possibility, that the person being referred to is eating the
street, is syntactically reasonable (though not pragmatically reasonable,
unless perhaps "he" refers to a bug-eyed asphalt-eating alien.)
See
also phrasal verb.
Both active and passive
voice are described in the article on active voice.
See modal
operators - tense and tense.
An abbreviation for Past Participle,
particularly in grammar rules.
See tense.
= object.
Person is a feature of
English noun phrases that is principally of significance with pronouns and related forms. The possible values of
person are first person signifying the speaker (possibly with
his/her companions), second
person signifying the
person addressed (possibly with his/her companions), and third person signifying anybody else, i.e. not speaker or person
addressed or companion of either.
Below
is a table of the forms of pronouns, etc. in English, classified by person and
syntactic case:
case                 | first person     | second person               | third person
nominative           | I/we             | thou/you/ye                 | he/she/it/they
accusative           | me/us            | thee/you/ye                 | him/her/it/them
possessive adjective | my/our           | thy/your                    | his/her/its/their
possessive pronoun   | mine/ours        | thine/yours                 | his/hers/its/theirs
reflexive            | myself/ourselves | thyself/yourself/yourselves | himself/herself/itself/themselves
A low-level
classification of linguistic sounds - phones are the acoustic patterns that are
significant and distinguishable in some human language. Particular languages
may group together several phones and regard them as equivalent. For example,
the L-sounds at the beginning and end of the English word "loyal",
termed "light L" and "dark L" by
linguists, are treated as distinct sounds in some languages, but light L
and dark L are termed allophones of
L in English. Similarly, the L and R sounds of English are regarded as equivalent
in some other languages.
Start by reading about phones. Phonemes are the
groups of phones (i.e. allophones) regarded as linguistically equivalent by
speakers of a particular language. Thus native English speakers hear light L
and dark L as the same sound, namely the phoneme L, unless trained to do
otherwise. One or more phonemes make up a morpheme.
The study of acoustic
signals from a linguistic viewpoint, that is, how acoustic signals are
classified into phones.
The study of phones, and how they are
grouped together in particular human languages to form phonemes.
A kind of non-terminal symbol of a grammar - a non-terminal
determines a phrasal category if it cannot appear on the left-hand side of a lexical insertion rule,
that is, a rule of the form X → w, where w is a word.
Examples include NP, VP, PP, ADJP, ADVP and S. Non-examples include N, V, ADJ,
PREP, INTERJ, ADV (see lexical
category).
Contrast
with lexical category.
A
phrasal verb is one whose meaning is completed by the use of a particle. Different particles
can give rise to different meanings. The verb "take" participates in
a number of phrasal verb constructs - for example:
take in  | deceive                 | He was taken in by the swindler
take in  | help, esp. with housing | The homeless refugees were taken in by the Sisters of Mercy
take up  | accept                  | They took up the offer of help.
take off | remove                  | She took off her hat.
A unit of language
larger than a word but smaller than a sentence. Examples include noun
phrases, verb phrases, adjectival phrases, and adverbial phrases.
See
also phrasal categories.
See tense.
A predicate
operator that handles
plurals. plur transforms a predicate like book1 into a predicate plur(book1).
If book1 is true of any book, then plur(book1) is true of any set of books with
more than one member. Thus "the books fell" could be represented by
the(X : plur(book1)(X), past(fall1(X))).
A noun in a form that signifies more than one of
whatever the base form of the noun refers to. For example, the plural of
"pizza" is "pizzas". While most plurals in English are
formed by adding "s" or "es", or occasionally doubling the
last letter and adding "es", there are a number of exceptions. Some
derive from words borrowed from other languages, like
"criterion"/"criteria", "minimum"/"minima",
"cherub"/"cherubim", and
"vertex"/"vertices". Others derive from Old English words
that formed plurals in nonstandard ways, like "man"/"men",
"mouse"/"mice", and "child"/"children".
This is a name applied
to two English pronoun forms that indicate possession. There are possessive
adjectives and possessive pronouns. They are tabulated below:
person & number                  | possessive adjective | possessive pronoun
first person singular            | my                   | mine
first person plural              | our                  | ours
second person singular (archaic) | thy                  | thine
second person (modern)           | your                 | yours
third person singular            | his/her/its          | his/hers/its
third person plural              | their                | theirs
Abbreviation for prepositional
phrase.
The problem of deciding
what component of a sentence should be modified by a prepositional phrase
appearing in the sentence. In the classic example "The boy saw the man on
the hill with the telescope", "with the telescope" could modify
"hill" (so the man is on the "hill with the telescope") or
it could modify "saw" (so the boy "saw with the
telescope"). The first attachment corresponds to a grammar rule like np →
np pp, while the second corresponds to a grammar rule like vp → v np pp. Both
rules should be present to capture both readings, but this inevitably leads to
a multiplicity of parses. The problem of choosing between the parses is normally
deferred to the semantic and pragmatic phases of processing.
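To make the two attachments concrete, here is a sketch of a toy Prolog DCG containing both rules. The names are assumptions, the NP rule is written without left recursion (which a DCG cannot execute directly), and the sentence is simplified to contain just one PP:

% Both attachment rules side by side
s(s(NP, VP))      --> np(NP), vp(VP).
vp(vp(V, NP))     --> v(V), np(NP).
vp(vp(V, NP, PP)) --> v(V), np(NP), pp(PP).   % PP attaches to the VP
np(np(D, N))      --> det(D), n(N).
np(np(D, N, PP))  --> det(D), n(N), pp(PP).   % PP attaches to the NP
pp(pp(P, NP))     --> prep(P), np(NP).

% Lexicon
det(art(the))    --> [the].
n(n(boy))        --> [boy].
n(n(man))        --> [man].
n(n(telescope))  --> [telescope].
v(v(saw))        --> [saw].
prep(prep(with)) --> [with].

The query phrase(s(T), [the, boy, saw, the, man, with, the, telescope]) succeeds twice on backtracking: once with the PP inside the object NP (the man has the telescope), and once with the PP as a sister of the verb (the seeing was done with the telescope).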
Pragmatics can be
described as the study of meaning in context, to be contrasted with semantics,
which covers meaning out of context. For example, if someone says "the
door is open", there is a single logical form for this. However, there is
much to be done beyond producing the logical form, in really understanding the
sentence. To begin with, it is necessary to know which door "the
door" refers to. Beyond that, we need to know what the intention of the
speaker (say, or the writer) is in making this utterance. It could be a pure statement
of fact, it could be an explanation of how the cat got in, or it could be a
tacit request to the person addressed to close the door.
It is also possible for a sentence to be
well-formed at the lexical, syntactic, and semantic levels, but ill-formed at
the pragmatic level because it is inappropriate or inexplicable in context. For
example, "Try to hit the person next to you as hard as you can" would
be pragmatically ill-formed in almost every conceivable situation in a lecture
on natural language processing, except in quotes as an example like this. (It
might however, be quite appropriate in some settings at a martial arts lesson.)
pre-terminal
= lexical category - a non-terminal symbol, such as N, V, or ADJ, that can be rewritten directly as a word by a lexical insertion rule of the form X → w. Contrast phrasal category.
This term is used in (at
least) three senses:
1. In NLP, equivalent to verb phrase,
used in the analysis of a sentence into a subject and predicate.
2.
In logic, a predicate is
a logical formula involving predicate symbols, variables, terms, quantifiers
and logical connectives.
3.
In Prolog - see here.
Predicate
operators form a part of the logical form language. They transform one predicate into another predicate. For example, the predicate operator PLUR transforms a singular predicate like (DOG x) which is true
if x is a dog, into a plural equivalent (PLUR DOG) such that ((PLUR DOG) x) is
true if x is a set of more than one dog.
A
predictive parser is a parsing algorithm that operates top-down, starting with the start symbol,
and predicting or guessing which grammar rule to use to rewrite the current
sentential form. Alternative grammar rules are stacked so that they can be
explored (using backtracking) if the current sequence of guesses turns out to
be wrong.
On
general context-free grammars, a vanilla predictive parser takes exponential
parsing time (i.e. it can be very very slow). See also bottom-up parsers.
symbol used in grammar
rules for a preposition.
A
preposition is a part of speech that is used to indicate the role that the noun phrase that
follows it plays in the sentence. For example, the preposition "with"
often signals that the NP that follows is the instrument of
the action described in the sentence. For example "She ate the pizza with
her fingers". It can also indicate the co-agent, especially if the NP describes something that is animate. For example, "She ate the pizza with her friends". As
well as signalling a relationship between a noun phrase and the main verb of
the sentence, a preposition can indicate a relationship between a noun phrase
and another noun phrase. This is particularly the case with "of", as
in "The length of the ruler is 40 cm".
Prepositions
are the head items of prepositional
phrases.
Preposition
is often abbreviated to PREP.
PREP
is a lexical grammatical
category.
Prepositional phrase is
a phrasal grammatical category. Prepositional phrase is usually abbreviated to
PP. PPs serve to modify a noun phrase or a verb or verb phrase. For example, in
"The house on the hill is green", the prepositional phrase "on
the hill" modifies "the house", while in "He shot the deer
with a rifle", "with a rifle" is a prepositional phrase that
modifies "shot ..." (except in the extremely rare case that the deer
has a rifle :→).
Prepositional
phrases normally consist of a preposition followed
by a noun phrase.
The
main exception is the possessive clitic 's, as in "my uncle's
car", where the 's functions as a preposition ("my
uncle's car" = "the car of my uncle") but follows the
noun phrase that the preposition normally precedes.
Occasionally
other structures are seen, such as "these errors notwithstanding", an
allowable variant on "notwithstanding these errors"
("notwithstanding" is a preposition).
PP
is a phrasal grammatical
category.
See
also PP attachment.
See modal
operators - tense and tense - present.
When
we think of grammar, we often think of the rules of good grammar that we may
have been taught when younger. In English, these may have included things like
"never split an infinitive" i.e. do not put an adverb between the
word "to" and the verb, as in "I want to really enjoy this
subject." (The origin of this rule is said to be the fact that
infinitive-splitting is grammatically impossible in Latin: some early
grammarians sought to transfer the rule to English for some twisted reason.)
Grammar in this sense is called "prescriptive grammar" and has
nothing to do with "descriptive grammar", which is what we are concerned with in
NLP.
An abbreviation for
Present Participle, particularly in grammar rules.
= grammar rule - see Chomsky hierarchy and context
free grammar.
See aspect.
A proper noun is a noun that names an
individual, such as Amelia Earhart, rather than a type, for example woman,
or philosophy. Proper nouns are often compound. Amelia and Earhart would
each rank as proper nouns in their own right.
Contrast common noun.
A
proposition is a statement of which it is possible to decide whether it is true
or false. There are atomic propositions, like "Mary likes pizza", and
compound ones involving logical connectives, as in "Mary likes pizza and John likes pasta".
Umbrella term for adjectives and nominals or noun modifiers.
1. (in semantic interpretation) - objects in the logical form language
that correspond to the various words and groups of words that act in language
in the way that quantifiers do in formal logic systems. Obvious examples of
such words in English include all, each, every, some, most, many, and several.
Less obvious examples include the, which is similar in effect to
"there exists a unique" - thus when we refer to, say, "the green
box", we are indicating that there exists, in the current discourse
context, a unique green box that is being referred to. This phrase would be
represented in the logical form language by an expression like the(b1 :
&(box1(b1), green1(b1))).
NL
quantifiers are typically restricted in the range of objects that the variable
ranges over. In Most dogs bark the variable in the MOST1
quantifier is restricted to dog1 objects: most1(d1 : dog1(d1), barks1(d1))
2.
In logic, this term
refers to the logical operators "forall" and "exists".
3.
This term is also
sometimes used for quantifying
determiners.
A quantifying
determiner, in English, is one of a fairly small class of words like
"all", "both", "some", "most",
"few", "more" that behave in a similar way to quantifiers
in logic.
See also determiners.
Reference, in NLP, is
the problem/methods of deciding to what real-world objects various natural
language expressions refer. Described in Chapter 14 of Allen. See also anaphor, cataphor, co-refer, discourse entity, history list,
and local discourse context.
A
type of ambiguity where what is uncertain is what is being referred to by a
particular natural language expression. For example, in John hit Paul.
He was angry with him. it is not entirely clear to whom the pronouns
"he" and "him" refer - it could be that the second sentence
explains the first, in which case the "he" is John, or it could be
that the second sentence gives a consequence of the first, in which case the
"he" is Paul.
1. Regular grammar = right-linear grammar - see Chomsky hierarchy.
2.
A regular verb, noun,
etc. is one that inflects in a regular way. "save" is a
regular verb and "house" is a regular noun. On the other hand,
"break" (with past tense "broke" and past participle
"broken") is an irregular verb, and
"mouse" (with plural form "mice") is an irregular noun.
Relative clauses involve
sentence forms used as modifiers in noun phrases. These clauses are often introduced by relative
pronouns such as who, which and that.
For example, "The man who gave Barry the money". See Allen p. 34.
The rewriting process is
what is used in derivation to get from one sentential form to
the next.
The
process is as follows with context free grammars: pick a non-terminal X in the current string (or sentential form)
and a grammar rule whose left-hand side is that non-terminal X. Replace X in
the current string by the right-hand side of the grammar rule, to obtain a new
current string. This definition also works for regular grammars. A single step in the rewriting process is
called a direct derivation. For an example, see derivation.
The
process is similar with context-sensitive grammars and unrestricted grammars, except that instead of picking a non-terminal
X in the current string, we find a substring of the current
string that matches the left-hand side of some context-sensitive or
unrestricted grammar rule, and replace it with the right-hand side of that
grammar rule.
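As a quick illustration using the toy context-free grammar from the parse tree entry above (writing ⇒ for each direct derivation step, and rewriting the leftmost non-terminal each time):

S ⇒ NP VP ⇒ PRO VP ⇒ he VP ⇒ he V NP ⇒ he ate NP ⇒ he ate ART N ⇒ he ate the N ⇒ he ate the pizza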
right-linear grammar
= regular grammar - see Chomsky hierarchy.
right-to-left parsing
Parsing that proceeds through the input string from the last word back towards the first, rather than in the usual left-to-right order.
A parser or other NLP
algorithm is robust if it can recover from or otherwise handle ill-formed or
otherwise deviant natural language expressions. For example, we would not want
a syntactic parser to give up as soon as it encounters a word that is not in
its lexicon - preferably it should try to infer the lexical category and
continue parsing (as humans do when they first encounter a new word).
symbol used in grammar
rules for a sentence.
One
of the choices for the person feature. A sentence is "in the second person" if
the subject of the sentence is the person(s) addressed
as in "you like pizza" and the archaic "Ye like pizza" and
"Thou likest pizza".
"you",
"thou" and "ye" are second-person pronouns, as is "thee". Other words with the second-person
feature include "yours", "thine", "your",
"thy", "yourself", "yourselves", and
"thyself".
A
variant on a context free grammar, in which the non-terminals correspond to
semantic rather than syntactic concepts. A system of this type encodes semantic
knowledge about the types of sentences likely to appear in the input in the
grammar rules. For example, a system to handle text about ships and ports might
encode in its grammar rules the information that the subject of a sentence about
"docking" must be a ship:
docksentence → shipnp dockvp
The
problem with semantic grammar is that for coverage of a significant portion of
a language, a huge number of rules would be required, and a massive analysis of
the meanings that those rules encode would be needed in their
development.
Semantics is the study
of meaning (as opposed to form/syntax, for example) in language. Normally
semantics is restricted to "meaning out of context" - that is, to
meaning so far as it can be determined without taking context into account. See Allen page 10 for the
different levels of language analysis, and chapters 8-12 for detailed treatment
(we covered parts of chapters 8 and 9 only in COMP9414).
See
also logical form.
Sentence is the level of
language above phrase. Above sentence are pragmatic-level structures that interconnect
sentences into paragraphs, etc., using concepts such as cohesion. They are
beyond the scope of this subject.
Sentences
are sometimes classified into simple, compound, complex, and compound-complex,
according to the absence or presence of conjunctions, relative clauses (phrases with verb groups that are
introduced by "which", "that", etc.) or both. They may also
be analysed into subject and predicate.
Sentence
is often abbreviated to S (see also start symbol).
S
is a phrasal grammatical
category.
See
article on derivation.
shift-reduce parser
A type of parsing algorithm, not discussed in
COMP9414.
See tense.
See tense.
See tense.
A noun in a form that
signifies one of whatever type of object the noun refers to. For example, dog is
singular, whereas dogs is plural.
Universally quantified
variables can be handled (and are handled in Prolog) simply by assuming that
any variable is universally quantified. Existentially quantified variables must
thus be removed in some way. This is handled by a technique called skolemization.
In its simplest form, skolemization replaces the
variable with a new constant, called a Skolem constant. For
example, the formula:
exists(y, forall(x, loves(x, y)))
would be encoded as an expression such as
loves(X, sk1),
where sk1 is a new constant that stands for the
object that is asserted to exist, i.e. the person (or whatever) that is loved
by every X.
Quantifier scoping dependencies are shown using
new functions called Skolem functions. For
example, the formula:
forall(y, exists(x, loves(x, y)))
would be encoded as an expression such as
loves(sk2(Y), Y),
where sk2 is a new function that produces a
potentially new object for each value of Y.
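The transformation is mechanical enough to sketch in a few lines of Prolog. This is a toy illustration, not course code: the formula encoding (Prolog variables for logical variables, exists/2 and forall/2 for quantifiers) is an assumption, and only formulas with all quantifiers at the front (prenex form) are handled.

:- use_module(library(gensym)).

% skolemize(+Formula, -Skolemized)
% Universal variables are left as free Prolog variables; each existential
% variable is bound to a Skolem constant, or to a Skolem function of the
% universal variables in whose scope it appears.
skolemize(forall(X, F), G) :- !, skolemize_univ(F, [X], G).
skolemize(exists(Y, F), G) :- !, gensym(sk, Y), skolemize(F, G).
skolemize(F, F).

skolemize_univ(forall(X, F), Univ, G) :- !, skolemize_univ(F, [X|Univ], G).
skolemize_univ(exists(Y, F), Univ, G) :- !,
    gensym(sk, Name),
    Y =.. [Name|Univ],            % e.g. Y = sk2(X): a Skolem function
    skolemize_univ(F, Univ, G).
skolemize_univ(F, _, F).

With this sketch, skolemize(exists(Y, forall(X, loves(X, Y))), G) gives G = loves(X, sk1), and skolemize(forall(X, exists(Y, loves(X, Y))), G) gives G = loves(X, sk2(X)), mirroring the two examples above (the sk1/sk2 names depend on the gensym counter).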
See Skolem functions
A term from the
pragmatic end of language use: when we say (or write) something, each utterance
has a purpose and, if effective, accomplishes an act of some type. Examples of
different types of speech act include: ask, request, inform, deny, congratulate,
confirm, and promise. Not covered in COMP9414, except to point out
that it is ultimately vital to understanding language. For some discussion, see Allen p. 542 ff., and
compare surface speech act.
The
start symbol of a grammar is another name for the "distinguished
non-terminal" of the grammar. Details at context-free grammar. The start symbol of most NLP grammars is S (for sentence).
A group of techniques
relying on mathematical statistics and used in NLP to, for example, find the
most likely lexical categories or parses for a sentence. Often the techniques
are based on frequency information collected by analysing very large corpora of sentences in a
single language, to find out, for example, how many times a particular word (dog,
perhaps) has been used with a particular part of speech. The sentences in the
corpus have usually been tagged in some way
(sometimes manually) so that the part of speech of each occurrence of each
word is known. Sometimes the sentences are hand-parsed as well
(a treebank).
See
chapter 7 in Allen, and also Bayes' rule, bigram, trigram, n-gram, conditional probability, statistical independence, Hidden Markov Model, and Viterbi algorithm.
string
A "string over an alphabet
A" means a sequence of symbols taken from the alphabet A, where by alphabet we mean just a set of symbols that we are
using in a similar way to the way that we use, say, the Latin alphabet to make
up words. Thus a word (in English) is a string over the alphabet {a, b, c, d,
e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z} (plus
arguably a few other items like hyphen and apostrophe). A construct like "ART
ADJ NOUN" is a string over an alphabet that includes the symbols ART, ADJ,
and NOUN. Similarly "NP of NP" is a string (over some alphabet that
includes the symbols "NP" and "of") that has two non-terminal symbols and one terminal symbol (namely
"of").
A
form of ambiguity in which what is in doubt is the syntactic structure of the
sentence or fragment of language in question. An example of pure structural
ambiguity is "old men and women" which is ambiguous in that it is not
clear whether the adjective old applies to the women or just to the men.
Frequently structural ambiguity occurs in conjunction with word-sense ambiguity, as in "the red eyes water" which could signify "the
communist looks at water":
s(np(art(the), n(red)), vp(v(eyes), np(n(water))))
or alternatively "the reddened eyes drip tear fluid"
s(np(art(the), adj(red), n(eyes)), vp(v(water)))
See
also referential ambiguity.
The name for the feature
used to record the subcategorization of a verb or adjective.
Verbs and some adjectives admit complement structures.
They are said to subcategorize the structures that they can be
followed by. For example, some verbs can be followed by two noun phrases (like Jack
gave Mary food), some by at most one (like Jack kicked the dog),
and some by none (like Jack laughed). We would record this by
saying that the verbs have subcat np_np, or subcat np,
or subcat none. Further examples are shown below (taken from Figures
4.2 and 4.4 in Allen):
Value      | Example Verb | Example of Use
none       | laugh        | Jack laughed
np         | find         | Jack found a key
np_np      | give         | Jack gave Sue the paper
vp:inf     | want         | Jack wants to fly
np_vp:inf  | tell         | Jack told the man to go
vp:ing     | keep         | Jack keeps hoping for the best
np_vp:ing  | catch        | Jack caught Sam looking at his desk
np_vp:base | watch        | Jack watched Sam look at his desk
np_pp:to   | give         | Jack gave the key to the man
pp:loc     | be           | Jack is at the store
np_pp:loc  | put          | Jack put the box in the corner
pp:mot     | go           | Jack went to the store
np_pp:mot  | take         | Jack took the hat to the party
adjp       | be, seem     | Jack is happy
np_adjp    | keep         | Jack kept the dinner hot
s:that     | believe      | Jack believed that sharks wear wedding rings
s:for      | hope         | Jack hoped for Mary to eat the pizza.
Notice
that several verbs (give, be, keep) among the examples have more than one subcat.
This is not unusual. As an example of subcategorization by adjectives, notice
that "Freddo was happy to be a frog" is OK, so happy subcategorizes
vp:inf, but "Freddo was green to ..." cannot be completed in any way,
so green does not subcategorize vp:inf.
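One way to make such a table machine-usable is to record subcat values as features on lexical entries. A minimal Prolog sketch (the lex/3 format is an assumption, not the course's actual lexicon representation):

% Lexical entries carrying a subcat feature
lex(laugh, v,   subcat(none)).
lex(find,  v,   subcat(np)).
lex(give,  v,   subcat(np_np)).
lex(give,  v,   subcat(np_pp:to)).   % "give" has more than one subcat
lex(want,  v,   subcat(vp:inf)).
lex(happy, adj, subcat(vp:inf)).     % adjectives subcategorize too

A grammar rule can then insist on a particular value - a ditransitive VP rule, for instance, would look up lex(Verb, v, subcat(np_np)).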
The subject of a sentence is the noun phrase that appears before the verb in a declarative English sentence. For example, in The cat sat on the
mat, The cat is the subject. In The mat was sat on, The
mat is the subject. Subject noun phrases can be arbitrarily long and
complex, and may not look like "typical" noun phrases. For example,
in Surfing the net caused him to fail his course, the subject is surfing
the net. [Please excuse the subliminal message.]
A
subjunctive sentence is an embedded sentence that expresses a proposition that
is counterfactual (not true), such as "If John were to eat more
pizza, he would make himself sick", as opposed to a Y/N-question, a
WH-question, a command, or a statement. As can be seen from the example, the
subjunctive form of a verb resembles the past form in modern English, even
though it frequently refers to a possible future action or state (as in our
example). The past forms of some modals (e.g. "should",
"would" which were originally the past forms of verbs
"shall" and "will") are used for little else in modern
English.
See also Y/N-question, WH-question, imperative, indicative,
and mood.
See failure of substitutivity.
This
term refers to analysing the type of sentence into standard syntactic
categories - assertion, command, and the two kinds of question:
yes/no-questions and wh-questions. See Allen p. 250.
To
be contrasted with (pragmatic) speech acts.
Not a part of COMP9414
Artificial Intelligence.
A figure of speech in
which a single word appears to be in the same relationship to two others, but
must be understood in a different sense with each of the two other words (the
"pair"), as in I'm leaving for greener pastures and ten days. See also zeugma.
One of a couple of dozen little-used terms for figures of speech.
Syntax means the rules
of language that primarily concern the form of phrases and
sentences, as distinct from the substructure of words (see morphology) or the meaning of phrases and sentences in or
out of context (see pragmatics and semantics).
An alternative approach
to linguistic grammar, driven from the functional (rather than the structural)
end of language. Not covered in COMP9414. See p.95 (Box 4.3) in Allen.
1. See part of speech tagging.
2.
As TAG, it
is also an acronym for Tree-Adjoining Grammar. TAGs were not mentioned
elsewhere in COMP9414, but are mentioned here just in case you run into them in
a book somewhere.
See part of speech tagging.
The tense of a verb or a sentence relates to the time when the action or state described by
the verb or sentence occurred/occurs/will occur. The main contrast in tense
is between past, present and future. Most verbs in English indicate the
past/present distinction by inflection (a
few, like "set", are invariant in this respect). Thus
"break" and "breaks" are the present tense forms of the
verb "break", and "broke" is the past tense form. The
future tense is constructed, in English, by using the auxiliaries "shall" and "will" with the verb -
"it will break", for example. Here is a list of the six major tense
forms:
Form                       | Example                    | Meaning
present                    | He drives a Ford           | The "drive" action occurs in the present, though it suggests that this is habitual - it may have occurred in the past and may continue in the future.
simple past                | He drove a Ford            | The action occurred at some time in the past.
simple future              | He will drive a Ford       | The action will occur at some time in the future.
past perfect or pluperfect | He had driven a Ford       | At some time in the past, it was true to say "He drove a Ford".
future perfect             | He will have driven a Ford | At some point in the future, it will be true to say "He drove a Ford".
See also participle.
Contrast tense, mood and aspect.
1. Used in the logical
form language to
describe constants and expressions that describe objects.
2.
Used in FOPC to refer to a class of objects that may be defined,
recursively, as follows:
·
a constant is a term;
·
a variable is a term;
·
a function f applied to a suitable number of terms t1, t2, ..., tn is a term: f(t1, t2, ..., tn).
3.
Used to refer to certain
types of Prolog language
constructs.
A terminal symbol of a grammar is a symbol that
can appear in a sentence of the grammar. In effect, a terminal
symbol is a word of the language described by the grammar.
See
also non-terminal symbol and context-free grammar.
The word the gives
rise to an important NL quantifier written as THE in Allen or as the or the1 in the Prolog
notation used in COMP9414 assignments. Thus the dog barks has
logical form:
(THE d1 : (DOG1 d1) (BARKS1 d1))   (Allen)
or
the(d1 : dog1(d1), barks1(d1))   (COMP9414)
Here d1 is the variable over
which THE quantifies.
This is also written as (BARKS1 <THE d1 DOG1>) in Allen's notation, and as barks1(the(d1, dog1)) in the Prolog notation.
= semantic case.
Term used for the noun
phrase that follows the verb in an active voice, indicative sentence in English. Also referred to as the object or sometimes the victim.
One
of the choices for the person feature. A sentence is "in the third person" if
the subject of the sentence is neither the speaker nor
the person(s) addressed, as in "she likes pizza" and "they like
pizza".
"she",
"he", "it", and "they" are third-person pronouns, as are "her", "him", and "them".
Other words with the third-person feature include "hers",
"his", "theirs", "their", "herself",
"himself", "itself" and "themselves".
top-down parser
A parser that starts by hypothesizing an S (see start symbol)
and proceeds to refine its hypothesis by expanding S using a grammar rule which
has S as its left-hand side (see rewriting process), successively refining the non-terminals so produced, and so
on until there are no non-terminals left (only terminals).
See
also predictive parser.
A verb that can take a
single syntactic object, like eat, as in "He ate the pizza".
Sometimes transitive verbs appear without their object, as in "He ate
slowly" - the distinguishing characteristic of transitive verbs is that
they can take an object (unlike intransitive verbs), and they cannot take two objects (as ditransitive verbs can). See also subcategorization.
A trigram is a triple of
things, but usually a triple of lexical categories. Suppose that we are
concerned with three lexical categories L1, L2 and L3.
The term trigram is used in statistical NLP in connection with the conditional probability that a word will belong to L3 given
that the preceding words were in L1 and L2.
This probability is written Pr(L3 | L2 L1),
or more fully Prob(wi ∈ L3 | wi–1 ∈ L2 ∧ wi–2 ∈ L1). For example, in the
phrase "The green flies", given that The is tagged with ART, and green with ADJ, we would be
concerned with the conditional probabilities Pr(N | ADJ ART) and Pr(V | ADJ
ART) given that flies can be tagged with N and V. See also bigram and n-gram.
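Such probabilities are usually estimated from a tagged corpus by relative-frequency counting. Here is a small SWI-Prolog sketch; the tags/1 corpus representation is an assumption, and real taggers would also smooth the counts:

% tags/1: each clause holds the tag sequence of one corpus sentence
tags([art, adj, n, v]).
tags([art, n, v, art, n]).
tags([pro, v, art, adj, n]).

% trigram(C1, C2, C3) succeeds once per occurrence of the triple
trigram(C1, C2, C3) :-
    tags(Seq),
    append(_, [C1, C2, C3|_], Seq).

% Estimate Pr(C3 | C1 C2) = count(C1 C2 C3) / count(C1 C2 followed by anything)
trigram_prob(C1, C2, C3, P) :-
    aggregate_all(count, trigram(C1, C2, C3), N),
    aggregate_all(count, trigram(C1, C2, _), D),
    D > 0,
    P is N / D.

With this toy corpus, trigram_prob(art, adj, n, P) gives P = 1.0: every occurrence of ART ADJ is followed by N.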
See Chomsky hierarchy.
symbol used in grammar
rules for a verb.
A feature used as part of
the logical form generation system described in lectures. It is used to provide
a "discourse variable" that corresponds to the constituent it belongs
to. It is useful in handling certain types of modifiers - for example, if we
have a ball b1 and it turns out to be red, then we can assert
(in the logical form for "red ball") that the object is both red and
a ball, by including &(ball1(b1), red1(b1)). Grammar rules,
once augmented to handle logical forms, usually give explicit instructions on
how to incorporate the var feature. For example, the rule
for intransitive VPs:
VP(var(?v), sem(lambda(A2, ?semv(?v, A2)))) →
V(subcat(none), var(?v), sem(?semv))
indicates that the VP's var feature
is derived from that of its V subconstituent and shows how the feature (?v) is
also incorporated into the sem of the VP.
var features are
described in Allen on page 268 ff.
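For concreteness, one way (an assumption, not the course's actual implementation) to render the intransitive-VP rule above as a Prolog DCG, with the var and sem features carried as arguments:

% Intransitive-VP rule: the VP inherits its var from the V, and its sem
% wraps the verb's sem around that variable
vp(var(V), sem(lambda(A2, Sem))) -->
    v(subcat(none), var(V), sem(SemV)),
    { Sem =.. [SemV, V, A2] }.        % builds e.g. laughs1(e1, A2)

% A toy lexical entry supplying the verb's sem and a discourse variable
v(subcat(none), var(e1), sem(laughs1)) --> [laughs].

The query phrase(vp(var(E), sem(S)), [laughs]) then binds E = e1 and S = lambda(A2, laughs1(e1, A2)).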
A word describing an
action or state or attitude. Examples of each of these would be "ate"
in "Jane ate the pizza", "is" in "Jane is happy",
and "believed" in "Jane believed Paul at the pizza".
Verbs are one of the major sources of inflection in English, with most verbs having five
distinct forms (like "eat" with
"eat"/"eats"/"eating"/"ate"/"eaten".
The verb "be" is the most irregular, with forms "be",
"am", "is", "are", "being",
"was", "were", "been", plus some archaic forms,
like "art" as in "thou art".
Verb
is often abbreviated to V.
V
is a lexical grammatical
category.
The structure which
follows the verb or verb group in a sentence.
Example                           | Type of complement
Jane laughed.                     | empty
Jane ate the pizza.               | NP
Jane believed Paul ate the pizza. | S
Jane wanted to eat the pizza.     | to+VP
Jane gave Paul the pizza.         | NP+NP
Jane was happy to eat the pizza.  | ADJP
See also verb phrase.
This term is used for a
sequence of words headed by a verb together with auxiliaries, and possibly adverbs and the negative
particle "not".
For
example, in "Jane may not have eaten all the pizza", the verb group
is "may not have eaten".
Verb Phrase is a phrasal grammatical category. Verb phrase is usually abbreviated to VP. A verb phrase normally
consists of a verb or verb group and
a complement, together possibly with adverbial modifiers and PP modifiers. The
simplest complements are noun phrases,
but sentential complements and similar structures are also possible.
The logical form of a verb phrase is a lambda-expression. For example, the logical form of "likes pizza" would
be something like λ(X, likes1(st1, X, pizza1)), where st1 is the var feature variable for the state of liking (pizza), and likes1 and
pizza1 are the semantic interpretations of the verb "likes" and the
noun "pizza", respectively.
See
also predicate.
A feature of verbs that signifies what form of the
verb is present - particularly useful with verbs that are irregular in some of their
forms, or where a particular form of the verb is required by a particular
syntactic rule (for example, modal auxiliaries force the infinitive form of the
verb - VFORM inf).
VFORM   | Example                                                 | Comment
base    | break, be, set, decide                                  | base form
pres    | break, breaks, am, is, are, set, sets, decide, decides  | simple present tense
past    | broke, was, were, set, decided                          | simple past tense
fin     | -                                                       | finite = tensed = pres or past
ing     | breaking, being, setting, deciding                      | present participle
pastprt | broken, been, set, decided                              | past participle
inf     | -                                                       | used for infinitive forms with to
= object.
The Viterbi algorithm is
an algorithm applicable in a range of situations that allows a space that
apparently has an exponential number of points in it to be searched in
polynomial time.
The Viterbi algorithm was not actually described
in detail in COMP9414, but was referred to in the section on statistical NLP in
connection with a method for finding the most likely sequence of tags for a
sequence of words. Reference: Allen p. 201 ff.,
especially from p. 202.
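For the curious, the recurrence can be sketched in a few lines of SWI-Prolog. This is purely an illustration, not course material: tag/1 (the tagset), start(T, P) = Pr(T at the first position), trans(T1, T2, P) = Pr(T2 | T1), and emit(T, W, P) = Pr(W | T) are assumed model predicates. A state is a pair P-RevTags: the probability of the best tag sequence found so far, with that sequence held in reverse.

% viterbi(+Words, -Tags): most likely tag sequence for Words
viterbi([W|Ws], Tags) :-
    findall(P-[T],
            ( tag(T), start(T, PS), emit(T, W, PE), P is PS*PE ),
            S0),
    foldl(vstep, Ws, S0, SF),
    aggregate_all(max(P-R), member(P-R, SF), _-Best),
    reverse(Best, Tags).

% One step: for each tag T2 of the next word, keep only the best
% extension of the existing states - this pruning is what keeps the
% search polynomial rather than exponential
vstep(W, S0, S1) :-
    findall(P2-[T2|Prev],
            ( tag(T2), emit(T2, W, PE),
              aggregate_all(max(P1-Prev1),
                            ( member(P0-[T1|Rest], S0),
                              trans(T1, T2, PT),
                              P1 is P0*PT,
                              Prev1 = [T1|Rest] ),
                            PBest-Prev),
              P2 is PBest*PE ),
            S1).

At each word the algorithm keeps at most one state per tag, so the work is proportional to (number of words) × (number of tags)², not to the number of possible tag sequences.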
symbol used in grammar
rules for a verb phrase.
wh-question
A WH-question sentence is one that expresses a
question whose answer is not merely yes or no, as opposed to a Y/N-question, a
command, or a statement. See also Y/N-question, imperative, indicative, subjunctive, and mood.
Words are units of
language. They are built of morphemes and are used to build phrases (which are in turn
used to build sentences).
·
See also lexeme
·
See also terminal symbol
One of several possible
meanings for a word, particularly one of several with the same part of speech.
For example, dog as a noun has at least the following senses:
canine animal, a type of fastening, a low person, a constellation - the
dog barked, dog the hatches, You filthy dog! Canis major is also called the
great dog.
A
kind of ambiguity where what is in doubt is what sense of a word is intended.
One classic example is in the sentence "John shot some bucks". Here
there are (at least) two readings - one corresponding to interpreting
"bucks" as meaning male deer, and "shot" meaning to kill,
wound or damage with a projectile weapon (gun or arrow), and the other
corresponding to interpreting "shot" as meaning "waste",
and "bucks" as meaning dollars. Other readings (such as damaging some
dollars) are possible but semantically implausible. Notice that all readings
mentioned have the same syntactic structure, as in each case, "shot"
is a verb and "bucks" is a noun.
See also structural ambiguity and referential
ambiguity.
y/n question
A Y/N-question sentence is one that expresses a
question whose answer is either yes or no, as opposed to a WH-question, a
command, or a statement. See also WH-question, imperative, indicative, subjunctive, and mood.
Not a part of COMP9414
Artificial Intelligence, but it allows us to avoid having an empty list of
Z-concepts in the NLP Dictionary. :-)
A zeugma is a syllepsis in which the single word fails to give
meaning to one of its pair. She greeted him with arms and expectations
wide.
One
of a couple of dozen little-used terms for figures of speech.