How to get the logical part of a sentence in Java?

Suppose there is a sentence:

On March 1, he was born.

Change it to

He was born on March 1.

This reordering does not break the meaning of the sentence, and the result is still valid. Moving the words around in any other way produces strange or invalid sentences. So basically, I'm talking about the parts of a sentence that make the information more specific, but whose removal doesn't break the whole sentence. Are there any NLP libraries that can identify these parts?

Solution

Constituents

It sounds like you want to identify the constituents of a sentence, which are groups of words that function as a single unit according to the grammar of the language.

In fact, when linguists try to discover the grammar of a language, they do so in part by looking at movement. In your example, "On March 1" is a group of words that can be moved to a different position in the sentence while still retaining the meaning.

Constituents can be single words, phrases, or even larger groups, such as an entire clause. Within a sentence, they form a nested hierarchy. For example, the first example sentence you give can be analyzed as:

(S  (PP (IN On) (NP (NNP March) (CD 1)))
    (NP (PRP he))
    (VP (VBD was) (VP (VBN born))))

The whole sentence consists of a prepositional phrase, followed by a noun phrase, followed by a verb phrase. The prepositional phrase can be further decomposed into a single word, "On", followed by a noun phrase.
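As a rough illustration of that nesting, the bracketed parse above can be read into a small tree and each phrase printed with the words it covers. This is only a sketch in plain Java: the class `ConstituentLister` and its `Node`, `parse`, and `collect` names are made up for this example and are not part of any parser library.

```java
import java.util.ArrayList;
import java.util.List;

public class ConstituentLister {

    /** A tree node: a label plus either child nodes or a single word. */
    static final class Node {
        final String label;
        final String word;                       // non-null only for (TAG word) leaves
        final List<Node> children = new ArrayList<>();

        Node(String label, String word) {
            this.label = label;
            this.word = word;
        }

        /** The words covered by this node, joined by single spaces. */
        String text() {
            if (word != null) return word;
            StringBuilder sb = new StringBuilder();
            for (Node child : children) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(child.text());
            }
            return sb.toString();
        }
    }

    private final String[] tokens;
    private int pos = 0;

    private ConstituentLister(String bracketed) {
        // Pad parentheses with spaces so a whitespace split gives clean tokens.
        tokens = bracketed.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
    }

    /** Parses a well-formed bracketing such as "(NP (PRP he))". */
    static Node parse(String bracketed) {
        return new ConstituentLister(bracketed).parseNode();
    }

    private Node parseNode() {
        pos++;                                   // consume "("
        String label = tokens[pos++];
        Node node;
        if (tokens[pos].equals("(")) {           // phrase: one or more child nodes
            node = new Node(label, null);
            while (tokens[pos].equals("(")) {
                node.children.add(parseNode());
            }
        } else {                                 // leaf: part-of-speech tag plus word
            node = new Node(label, tokens[pos++]);
        }
        pos++;                                   // consume ")"
        return node;
    }

    /** Adds "LABEL: words" for every phrase-level node, outermost first. */
    static void collect(Node node, List<String> out) {
        if (node.word != null) return;           // skip single-word tag nodes
        out.add(node.label + ": " + node.text());
        for (Node child : node.children) collect(child, out);
    }

    public static void main(String[] args) {
        String tree = "(S (PP (IN On) (NP (NNP March) (CD 1)))"
                + " (NP (PRP he))"
                + " (VP (VBD was) (VP (VBN born))))";
        List<String> constituents = new ArrayList<>();
        collect(parse(tree), constituents);
        constituents.forEach(System.out::println);
    }
}
```

Running `main` lists the sentence itself first (`S: On March 1 he was born`), then the phrases nested inside it, starting with `PP: On March 1` and `NP: March 1`. The reader assumes well-formed input and does no error handling.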

Phrase structure parser

To find constituents automatically, you will need a phrase structure parser. There are many such parsers available as open source, including:

- Stanford parser (Java)
- Berkeley parser (Java)
- BLLIP (Charniak-Johnson) parser (C++)
- Bikel parser (a reimplementation and improved version of the Collins parser, written in Java)
- Collins parser (C)
- OpenNLP parser (Java)
- SharpNLP parser (C#)

The Stanford and Berkeley parsers are probably the easiest to install and use. As shown by Cer et al. 2010, the most accurate parsers are Berkeley and Charniak-Johnson. The Bikel parser is slower and less accurate than the others.

Online demonstration

There is an online demonstration of the Stanford parser. I used it to generate the parse of the example sentence given above.

Notes on deletion

Each constituent has a head word. For example, take the noun phrase:

(NP (DT the) (JJ large) (JJ blue) (NN ball))

The head word here is the noun "ball", which is modified by the adjectives "large" and "blue". If the noun phrase is embedded in a sentence, you can delete those modifiers and still have a sentence that means the same as the original, only less specific.

In noun phrases, you can usually delete adjectives, non-head nouns, and nested prepositional phrases.

In verb phrases and full clauses, things get trickier, because deleting constituents that serve as arguments of the verb can completely change the interpretation of the sentence. For example, deleting "the book" from "He sold Jim the book" results in "He sold Jim".
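The safer, noun-phrase case described above could be sketched roughly as follows: read a bracketed parse into a tree, then drop the adjective children (tags starting with `JJ`) of every `NP`. The class `ModifierPruner`, its method names, and the sentence "He kicked the large blue ball" are assumptions made up for this illustration. The sketch deliberately leaves verb arguments alone, and it does not try to find non-head nouns, which would need head-finding rules.

```java
import java.util.ArrayList;
import java.util.List;

public class ModifierPruner {

    /** A tree node: a label plus either child nodes or a single word. */
    static final class Node {
        final String label;
        final String word;                       // non-null only for (TAG word) leaves
        final List<Node> children = new ArrayList<>();

        Node(String label, String word) {
            this.label = label;
            this.word = word;
        }

        /** The words covered by this node, joined by single spaces. */
        String text() {
            if (word != null) return word;
            StringBuilder sb = new StringBuilder();
            for (Node child : children) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(child.text());
            }
            return sb.toString();
        }
    }

    private final String[] tokens;
    private int pos = 0;

    private ModifierPruner(String bracketed) {
        // Pad parentheses with spaces so a whitespace split gives clean tokens.
        tokens = bracketed.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
    }

    /** Parses a well-formed bracketing such as "(NP (DT the) (NN ball))". */
    static Node parse(String bracketed) {
        return new ModifierPruner(bracketed).parseNode();
    }

    private Node parseNode() {
        pos++;                                   // consume "("
        String label = tokens[pos++];
        Node node;
        if (tokens[pos].equals("(")) {           // phrase: one or more child nodes
            node = new Node(label, null);
            while (tokens[pos].equals("(")) {
                node.children.add(parseNode());
            }
        } else {                                 // leaf: part-of-speech tag plus word
            node = new Node(label, tokens[pos++]);
        }
        pos++;                                   // consume ")"
        return node;
    }

    /** Removes adjective children (JJ, JJR, JJS) from every NP, in place. */
    static void dropAdjectives(Node node) {
        if (node.label.equals("NP")) {
            node.children.removeIf(child -> child.label.startsWith("JJ"));
        }
        for (Node child : node.children) dropAdjectives(child);
    }

    public static void main(String[] args) {
        // Hypothetical sentence embedding the noun phrase from the text.
        Node sentence = parse("(S (NP (PRP He))"
                + " (VP (VBD kicked) (NP (DT the) (JJ large) (JJ blue) (NN ball))))");
        dropAdjectives(sentence);
        System.out.println(sentence.text());     // prints: He kicked the ball
    }
}
```

Only children of an `NP` are touched, so everything under the `VP` other than the embedded noun phrase is left as it was.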
