Read bool field from Verbnet #3044

Merged
merged 2 commits into from
Sep 5, 2022

tomaarsen
Member

Fixes #2996

Hello!

Pull request overview

  • Read bool field from VerbNet, and store the data as a negated field in the semantics.
  • Fix the printing of this data.

The issue

The following snippet is a segment of murder-42.1.xml, a Verbnet file:

...
            <SEMANTICS>
                <PRED value="cause">
                    <ARGS>
                        <ARG type="ThemRole" value="Agent"/>
                        <ARG type="Event" value="E"/>
                    </ARGS>
                </PRED>
                <PRED value="alive">
                    <ARGS>
                        <ARG type="Event" value="start(E)"/>
                        <ARG type="ThemRole" value="Patient"/>
                    </ARGS>
                </PRED>
                <PRED bool="!" value="alive">
                    <ARGS>
                        <ARG type="Event" value="result(E)"/>
                        <ARG type="ThemRole" value="Patient"/>
                    </ARGS>
                </PRED>
            </SEMANTICS>
...

Note the bool="!" on the last predicate. It negates the predicate: the result of the event is that the Patient is not alive, whereas the preceding predicate states that the Patient starts out alive. However, the NLTK VerbnetCorpusReader never reads this bool field.
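For reference, the attribute is easy to see in the raw XML. The sketch below uses only the standard library's ElementTree on a shortened copy of the snippet above; it is an illustration of where the data lives, not the reader's code.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the SEMANTICS segment shown above.
SNIPPET = """
<SEMANTICS>
    <PRED value="alive">
        <ARGS>
            <ARG type="Event" value="start(E)"/>
            <ARG type="ThemRole" value="Patient"/>
        </ARGS>
    </PRED>
    <PRED bool="!" value="alive">
        <ARGS>
            <ARG type="Event" value="result(E)"/>
            <ARG type="ThemRole" value="Patient"/>
        </ARGS>
    </PRED>
</SEMANTICS>
"""

semantics = ET.fromstring(SNIPPET)

# Element.get returns None when the attribute is absent,
# so only the second predicate carries a bool value.
bool_values = [pred.get("bool") for pred in semantics.findall("PRED")]
print(bool_values)  # [None, '!']
```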

Consequently, printing the frames of murder-42.1 gives:

>>> from nltk.corpus import verbnet
>>> print(verbnet.pprint_frames("murder-42.1"))
Basic Transitive
  Example: Brutus murdered Julius Cesar.
  Syntax: NP[Agent] VERB NP[Patient]
  Semantics:
    * cause(Agent, E)
    * alive(start(E), Patient)
    * alive(result(E), Patient)
NP-PP (Instrument-PP)
  Example: Caesar killed Brutus with a knife.
  Syntax: NP[Agent] VERB NP[Patient] PREP[with] NP[Instrument]
  Semantics:
    * cause(Agent, E)
    * alive(start(E), Patient)
    * alive(result(E), Patient)
    * use(during(E), Agent, Instrument)

Note that both the start and the result are printed as plain alive, which conflicts with the official semantics of murder-42.1, which clearly show ¬alive(result(E), Patient).

Changes

The _get_semantics_within_frame method, responsible for reading this data, now also outputs a "negated" field in its dictionary, set to True if there is a bool="!" on that predicate, and False otherwise. This "negated" field is then used in _pprint_semantics_within_frame, where a ¬ is prepended to the predicate value.
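The two steps can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the function names semantics_within_frame and format_semantics are hypothetical stand-ins, and the FRAME element is a trimmed example rather than a full Verbnet frame.

```python
import xml.etree.ElementTree as ET


def semantics_within_frame(frame):
    """Hypothetical stand-in: collect one dict per PRED in a frame."""
    predicates = []
    for pred in frame.findall("SEMANTICS/PRED"):
        arguments = [
            {"type": arg.get("type"), "value": arg.get("value")}
            for arg in pred.findall("ARGS/ARG")
        ]
        predicates.append(
            {
                "predicate_value": pred.get("value"),
                "arguments": arguments,
                # True only when the predicate carries bool="!"
                "negated": pred.get("bool") == "!",
            }
        )
    return predicates


def format_semantics(predicates, indent=""):
    """Hypothetical stand-in: render predicates, prefixing negated ones with ¬."""
    lines = []
    for predicate in predicates:
        args = ", ".join(arg["value"] for arg in predicate["arguments"])
        prefix = "¬" if predicate["negated"] else ""
        lines.append(f"{indent}* {prefix}{predicate['predicate_value']}({args})")
    return "\n".join(lines)


# Trimmed example frame, assumed for illustration only.
FRAME = ET.fromstring(
    """
<FRAME>
    <SEMANTICS>
        <PRED value="alive">
            <ARGS>
                <ARG type="Event" value="start(E)"/>
                <ARG type="ThemRole" value="Patient"/>
            </ARGS>
        </PRED>
        <PRED bool="!" value="alive">
            <ARGS>
                <ARG type="Event" value="result(E)"/>
                <ARG type="ThemRole" value="Patient"/>
            </ARGS>
        </PRED>
    </SEMANTICS>
</FRAME>
"""
)

rendered = format_semantics(semantics_within_frame(FRAME))
print(rendered)
```

Storing a boolean rather than the raw "!" string keeps the dictionary independent of the XML notation, so callers can test the value directly.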

Results

>>> from nltk.corpus import verbnet
>>> print(verbnet.pprint_frames("murder-42.1"))
Basic Transitive
  Example: Brutus murdered Julius Cesar.
  Syntax: NP[Agent] VERB NP[Patient]
  Semantics:
    * cause(Agent, E)
    * alive(start(E), Patient)
    * ¬alive(result(E), Patient)
NP-PP (Instrument-PP)
  Example: Caesar killed Brutus with a knife.
  Syntax: NP[Agent] VERB NP[Patient] PREP[with] NP[Instrument]
  Semantics:
    * cause(Agent, E)
    * alive(start(E), Patient)
    * ¬alive(result(E), Patient)
    * use(during(E), Agent, Instrument)

This negated value is also returned by verbnet.frames now:

>>> from nltk.corpus import verbnet
>>> from pprint import pprint
>>> pprint(verbnet.frames("murder-42.1"))
[{'description': {'primary': 'Basic Transitive', 'secondary': ''},
  'example': 'Brutus murdered Julius Cesar.',
  'semantics': [{'arguments': [{'type': 'ThemRole', 'value': 'Agent'},
                               {'type': 'Event', 'value': 'E'}],
                 'negated': False,
                 'predicate_value': 'cause'},
                {'arguments': [{'type': 'Event', 'value': 'start(E)'},
                               {'type': 'ThemRole', 'value': 'Patient'}],
                 'negated': False,
                 'predicate_value': 'alive'},
                {'arguments': [{'type': 'Event', 'value': 'result(E)'},
                               {'type': 'ThemRole', 'value': 'Patient'}],
                 'negated': True,
                 'predicate_value': 'alive'}],
  'syntax': [{'modifiers': {'selrestrs': [], 'synrestrs': [], 'value': 'Agent'},
              'pos_tag': 'NP'},
             {'modifiers': {'selrestrs': [], 'synrestrs': [], 'value': ''},
              'pos_tag': 'VERB'},
             {'modifiers': {'selrestrs': [],
                            'synrestrs': [],
                            'value': 'Patient'},
              'pos_tag': 'NP'}]},
 {'description': {'primary': 'NP-PP', 'secondary': 'Instrument-PP'},
  'example': 'Caesar killed Brutus with a knife.',
  'semantics': [{'arguments': [{'type': 'ThemRole', 'value': 'Agent'},
                               {'type': 'Event', 'value': 'E'}],
                 'negated': False,
                 'predicate_value': 'cause'},
                {'arguments': [{'type': 'Event', 'value': 'start(E)'},
                               {'type': 'ThemRole', 'value': 'Patient'}],
                 'negated': False,
                 'predicate_value': 'alive'},
                {'arguments': [{'type': 'Event', 'value': 'result(E)'},
                               {'type': 'ThemRole', 'value': 'Patient'}],
                 'negated': True,
                 'predicate_value': 'alive'},
                {'arguments': [{'type': 'Event', 'value': 'during(E)'},
                               {'type': 'ThemRole', 'value': 'Agent'},
                               {'type': 'ThemRole', 'value': 'Instrument'}],
                 'negated': False,
                 'predicate_value': 'use'}],
  'syntax': [{'modifiers': {'selrestrs': [], 'synrestrs': [], 'value': 'Agent'},
              'pos_tag': 'NP'},
             {'modifiers': {'selrestrs': [], 'synrestrs': [], 'value': ''},
              'pos_tag': 'VERB'},
             {'modifiers': {'selrestrs': [],
                            'synrestrs': [],
                            'value': 'Patient'},
              'pos_tag': 'NP'},
             {'modifiers': {'selrestrs': [], 'synrestrs': [], 'value': 'with'},
              'pos_tag': 'PREP'},
             {'modifiers': {'selrestrs': [],
                            'synrestrs': [],
                            'value': 'Instrument'},
              'pos_tag': 'NP'}]}]
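As a usage sketch, downstream code could filter on the new key. The helper negated_predicates below is hypothetical, and SAMPLE_FRAMES is a hand-written excerpt that mirrors the structure printed above; with NLTK and the Verbnet corpus installed, the result of verbnet.frames("murder-42.1") could be passed in instead.

```python
def negated_predicates(frames):
    """Hypothetical helper: list (frame name, predicate) pairs that are negated."""
    found = []
    for frame in frames:
        for predicate in frame["semantics"]:
            # .get tolerates dictionaries that lack the "negated" key.
            if predicate.get("negated", False):
                args = ", ".join(arg["value"] for arg in predicate["arguments"])
                found.append(
                    (
                        frame["description"]["primary"],
                        f"{predicate['predicate_value']}({args})",
                    )
                )
    return found


# Hand-written excerpt with the same shape as the output above.
SAMPLE_FRAMES = [
    {
        "description": {"primary": "Basic Transitive", "secondary": ""},
        "semantics": [
            {
                "arguments": [
                    {"type": "Event", "value": "start(E)"},
                    {"type": "ThemRole", "value": "Patient"},
                ],
                "negated": False,
                "predicate_value": "alive",
            },
            {
                "arguments": [
                    {"type": "Event", "value": "result(E)"},
                    {"type": "ThemRole", "value": "Patient"},
                ],
                "negated": True,
                "predicate_value": "alive",
            },
        ],
    }
]

negated = negated_predicates(SAMPLE_FRAMES)
print(negated)  # [('Basic Transitive', 'alive(result(E), Patient)')]
```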

I'm open to changing the "negated" dictionary key to something else, but I figured it was more descriptive than "bool".

Thank you @TMPxyz for raising #2996. As you seem familiar with Verbnet, perhaps you could review these changes briefly, and/or give your thoughts?

  • Tom Aarsen

@arademaker

What about the links to Wordnet 3.1? I didn't see the links in the output above; are you reading them?

@tomaarsen
Member Author

tomaarsen commented Sep 2, 2022

Yes, those are being read:

>>> from nltk.corpus import verbnet
>>> verbnet.wordnetids("murder-42.1")
['assassinate%2:41:00', 'butcher%2:35:00', 'dispatch%2:41:01', 'eliminate%2:30:00', 'execute%2:41:00', 'execute%2:41:01', 'immolate%2:40:00', 'liquidate%2:35:00', 'massacre%2:30:00', 'murder%2:41:00', 'slaughter%2:35:00', 'slaughter%2:30:00', 'slay%2:41:00']

This PR only touches the FRAMES section of a Verbnet file, while the links to Wordnet are in the MEMBERS section. See below for an example:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE VNCLASS SYSTEM "vn_class-3.dtd">
<VNCLASS ID="murder-42.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="vn_schema-3.xsd">
    <MEMBERS>
        <MEMBER name="assassinate" wn="assassinate%2:41:00"/>
        <MEMBER name="butcher" wn="butcher%2:35:00"/>
        <MEMBER name="dispatch" wn="dispatch%2:41:01"/>
        <MEMBER name="eliminate" wn="eliminate%2:30:00"/>
        <MEMBER name="execute" wn="execute%2:41:00 execute%2:41:01"/>
        <MEMBER name="immolate" wn="immolate%2:40:00"/>
        <MEMBER name="liquidate" wn="liquidate%2:35:00"/>
        <MEMBER name="massacre" wn="massacre%2:30:00"/>
        <MEMBER name="murder" wn="murder%2:41:00"/>
        <MEMBER name="slaughter" wn="slaughter%2:35:00 slaughter%2:30:00"/>
        <MEMBER name="slay" wn="slay%2:41:00"/>
    </MEMBERS>
    <THEMROLES>
        ...
    </THEMROLES>
    <FRAMES>            <--
        ...               |-- What this PR affects
    </FRAMES>           <--
    <SUBCLASSES>
        ...
    </SUBCLASSES>
</VNCLASS>
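To illustrate how the MEMBERS section maps to the list returned above, here is a small standard-library sketch. It is not necessarily how the NLTK reader does it; the function name wordnet_ids is hypothetical and the VNCLASS element is cut down to two members.

```python
import xml.etree.ElementTree as ET

# Cut-down class file with two members, assumed for illustration only.
VNCLASS = ET.fromstring(
    """
<VNCLASS ID="murder-42.1">
    <MEMBERS>
        <MEMBER name="assassinate" wn="assassinate%2:41:00"/>
        <MEMBER name="execute" wn="execute%2:41:00 execute%2:41:01"/>
    </MEMBERS>
</VNCLASS>
"""
)


def wordnet_ids(vnclass):
    """Hypothetical helper: flatten the space-separated wn attributes."""
    ids = []
    for member in vnclass.findall("MEMBERS/MEMBER"):
        # A member may list several sense keys separated by spaces.
        ids.extend(member.get("wn", "").split())
    return ids


ids = wordnet_ids(VNCLASS)
print(ids)  # ['assassinate%2:41:00', 'execute%2:41:00', 'execute%2:41:01']
```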

@stevenbird
Member

@TMPxyz any feedback?

@stevenbird stevenbird merged commit 2e9cf65 into nltk:develop Sep 5, 2022
@tomaarsen tomaarsen deleted the pr/2996 branch September 5, 2022 12:06
Successfully merging this pull request may close these issues.

Verbnet reader doesn't process the semantics predicate bool field