README - NOT UPDATED!

The dataset is read by CoNLL_Reader. You can access a dataset entry directly by subscripting with its index. To use an encoding scheme, initialize one of the schemes defined in paradigms.py.

The most recent scheme of interest is paradigms.DIRECTTAG. It indicates a word's role with respect to a predicate by labeling the role explicitly and prefixing the label with (>){1,mul}|(<){1,mul}, that is, one to mul repetitions of either > or <. DIRECTTAG takes two parameters at initialization. The first, mul, is the furthest distance a verb can be from a given label, measured in verb distance (how many other verbs are in between). The second, omitlemma, controls whether the verb's lemma is left out so that only its sense is kept (e.g. V01 instead of Vmake.01).
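
The tag format described above can be sketched in a few lines. This is only an illustration, not the code in paradigms.py: the helper names parse_tag and verb_tag are hypothetical, and the sketch assumes that the direction prefix is written directly in front of the role label (for example >>ARG1) and that the role label itself contains no > or < characters.

```python
import re
from typing import NamedTuple, Optional


class DirectTag(NamedTuple):
    direction: str  # ">" or "<", exactly as written in the tag
    distance: int   # number of repeated direction markers (1..mul)
    role: str       # role label that follows the prefix, e.g. "ARG1"


def parse_tag(tag: str, mul: int) -> Optional[DirectTag]:
    """Split a tag into direction, distance and role (hypothetical helper).

    Returns None when the tag does not fit (>){1,mul}|(<){1,mul} + role.
    Assumption: the prefix is glued directly to the role label.
    """
    if mul < 1:
        raise ValueError("mul must be at least 1")
    pattern = re.compile(
        rf"(?P<prefix>>{{1,{mul}}}|<{{1,{mul}}})(?P<role>[^<>]+)"
    )
    match = pattern.fullmatch(tag)
    if match is None:
        return None
    prefix = match.group("prefix")
    return DirectTag(prefix[0], len(prefix), match.group("role"))


def verb_tag(lemma: str, sense: str, omitlemma: bool) -> str:
    """Build a verb tag such as Vmake.01, or V01 when the lemma is omitted."""
    return f"V{sense}" if omitlemma else f"V{lemma}.{sense}"


print(parse_tag(">>ARG1", mul=2))
print(parse_tag(">>>ARG1", mul=2))
print(verb_tag("make", "01", omitlemma=True))
print(verb_tag("make", "01", omitlemma=False))
```

With mul=2, the tag >>ARG1 is split into direction >, distance 2 and role ARG1, while >>>ARG1 is rejected because its prefix is longer than mul.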

    from semantictagger.paradigms import DIRECTTAG
    from semantictagger.dataset import Dataset
    from pathlib import Path 

    fp = Path("./UP_English-EWT/en_ewt-up-train.conllu")
    dataset = Dataset(fp)

    # Get the entry at index 10
    entry = dataset[10]

    # Get all SRL annotations.
    # entry.depth indicates how many verbs the sentence has.
    annotations = [entry.get_srl_annotation(d) for d in range(entry.depth)]

    # Initialize the tagger (the first argument is mul)
    dirtag = DIRECTTAG(2, omitlemma=True)

    # Encode and decode
    encoded = dirtag.encode(entry)
    decoded = dirtag.decode(encoded)

    # Or simply call test(), which returns how many tags were recovered
    # correctly and incorrectly when the tagger is applied to the given entry.
    countcorrecttags, countfalsetags = dirtag.test(entry)

    # Returns
    # -------
    # sparsity : how many empty tags there are compared to non-empty ones
    # mean : mean of the set (empty tag excluded)
    # std : standard deviation (empty tag excluded)
    # dict_ : frequency dictionary
    # Results are printed if show_results=True.
    sparsity, mean, std, dict_ = dataset.getlabelfrequencies(dirtag, show_results=True, returndict=True)

    from semantictagger.datastats import collectaccuracy

    """
        Collects test results over the entire dataset.

        Returns
        -------
        correct : number of sentences labeled correctly
        false : number of sentences labeled incorrectly
        singlecorrect : number of tokens labeled correctly
        singlefalse : number of tokens labeled incorrectly

    """
    (correct, false), (singlecorrect, singlefalse) = collectaccuracy(dirtag, dataset, showresults=True)
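
The counts returned by test and collectaccuracy can be turned into accuracy ratios. The following is a minimal sketch, assuming the return shape shown above; the helper accuracy is hypothetical, and the counts in the usage lines are made-up placeholders, not results from this repository.

```python
def accuracy(correct: int, false: int) -> float:
    """Share of correct items among all items; 0.0 when there are none."""
    total = correct + false
    return correct / total if total else 0.0


# Placeholder counts in the shape returned by collectaccuracy (not real results).
(correct, false), (singlecorrect, singlefalse) = (3, 1), (45, 5)

sentence_accuracy = accuracy(correct, false)            # sentence level
token_accuracy = accuracy(singlecorrect, singlefalse)   # token level
print(f"sentence accuracy: {sentence_accuracy:.2%}")
print(f"token accuracy: {token_accuracy:.2%}")
```

Guarding against a zero total keeps the helper usable on an empty dataset or on an entry without any tags.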

TODO

Dataset entry no. 269 has two verbs in one annotation level. How should this be handled?
Sentence no. 5081 has two identical semantic role layers (the verbs coincide). What should be done with it?

alibektas/SRL-as-SequenceLabeling
