educe.pdtb.util package

Submodules

educe.pdtb.util.args module

Command line options

educe.pdtb.util.args.add_usual_input_args(parser)

Augment a subcommand argparser with typical input arguments. Sometimes your subcommand may require slightly different input arguments, in which case, just don’t call this function.

educe.pdtb.util.args.add_usual_output_args(parser)

Augment a subcommand argparser with typical output arguments. Sometimes your subcommand may require slightly different output arguments, in which case, just don’t call this function.

educe.pdtb.util.args.announce_output_dir(output_dir)

Tell the user where we saved the output

educe.pdtb.util.args.get_output_dir(args)

Return the output directory specified on (or inferred from) the command line arguments, creating it if necessary.

We try the following in order:

  1. If --output is given explicitly, we’ll just use/create that.
  2. Otherwise, just make a temporary directory. Later on, you’ll probably want to call announce_output_dir.

educe.pdtb.util.args.mk_output_path(odir, k)

Path stub (needs extension) given an output directory and a PDTB corpus key

educe.pdtb.util.args.read_corpus(args, verbose=True)

Read the section of the corpus specified in the command line arguments.
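
The output-directory helpers in this module follow a simple pattern: register an output option on the parser, then resolve it to a directory that exists. The sketch below illustrates that pattern using only the standard library. It is not the educe implementation: the sketch_* function names, the help text, and the announcement message are assumptions made for illustration; only the --output option and the "explicit directory first, temporary directory otherwise" order come from the description of get_output_dir.

```python
import argparse
import os
import tempfile


def sketch_add_output_args(parser):
    """Hypothetical stand-in for add_usual_output_args."""
    parser.add_argument("--output", metavar="DIR", default=None,
                        help="output directory (default: a temporary directory)")


def sketch_get_output_dir(args):
    """Hypothetical stand-in for get_output_dir."""
    if args.output:
        # 1. --output was given explicitly: use it, creating it if necessary
        os.makedirs(args.output, exist_ok=True)
        return args.output
    # 2. otherwise fall back to a fresh temporary directory
    return tempfile.mkdtemp()


parser = argparse.ArgumentParser(description="sketch of a subcommand")
sketch_add_output_args(parser)

args = parser.parse_args([])  # simulate a run without --output
output_dir = sketch_get_output_dir(args)

# Roughly what announce_output_dir is for: the user cannot guess a temp path
print("Output files written to", output_dir)
```

Because the fallback directory has an unpredictable name, announcing it at the end of the run is what makes the temporary-directory default usable in practice.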

educe.pdtb.util.features module

Feature extraction library functions for PDTB corpus

class educe.pdtb.util.features.DocumentPlus(key, doc)

Bases: tuple

doc

Alias for field number 1

key

Alias for field number 0

class educe.pdtb.util.features.FeatureInput(corpus, debug)

Bases: tuple

corpus

Alias for field number 0

debug

Alias for field number 1
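
DocumentPlus and FeatureInput are documented as tuple subclasses whose fields are aliases for positions, which is how named tuples behave. The sketch below rebuilds two equivalent types with collections.namedtuple to show how such values can be read by name or by position. The *Sketch type names and the placeholder key and document values are assumptions for illustration; a real key would be a PDTB corpus key and a real doc would come from the corpus.

```python
from collections import namedtuple

# Stand-ins with the same field order as the documented classes
DocumentPlusSketch = namedtuple("DocumentPlusSketch", ["key", "doc"])
FeatureInputSketch = namedtuple("FeatureInputSketch", ["corpus", "debug"])

# Placeholder values, not real corpus content
dplus = DocumentPlusSketch(key="placeholder-key", doc=["relation-1", "relation-2"])
inputs = FeatureInputSketch(corpus={dplus.key: dplus.doc}, debug=False)

# Fields are reachable by name or by position (key is field 0, doc is field 1)
same_key = dplus.key == dplus[0]
same_doc = dplus.doc is dplus[1]

# Like any tuple, the value can be unpacked
key, doc = dplus
```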

class educe.pdtb.util.features.RelKeys(inputs)

Bases: educe.learning.keys.MergedKeyGroup

Features for relations

fill(current, rel, target=None)

See RelSubgroup.fill

class educe.pdtb.util.features.RelSubGroup_Core

Bases: educe.pdtb.util.features.RelSubgroup

Core features

fill(current, rel, target=None)

class educe.pdtb.util.features.RelSubgroup(description, keys)

Bases: educe.learning.keys.KeyGroup

Abstract keygroup for subgroups of the merged RelKeys. We use these subgroup classes for modularity: the code that defines a set of related feature vector keys should live alongside the code that fills them out.

fill(current, rel, target=None)

Fill out a vector’s features. If the vector (target) is None, we just fill out this group; in the case of a merged key group, you may find it desirable to fill out the merged group instead.
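
The subgroup idea can be illustrated with a small self-contained sketch: each subgroup declares its own keys and knows how to fill them, and a merged group delegates to its subgroups while passing itself as the target. This is not the educe.learning.keys API. The *Sketch classes, the dict-based storage, the doc_key and num_args features, and the shapes of current and rel are all assumptions made for illustration.

```python
class SubgroupSketch(dict):
    """Hypothetical subgroup: owns a set of keys and the code that fills them."""

    def __init__(self, description, keys):
        super().__init__((key, None) for key in keys)
        self.description = description

    def fill(self, current, rel, target=None):
        raise NotImplementedError


class CoreSketch(SubgroupSketch):
    """Hypothetical 'core features' subgroup."""

    def __init__(self):
        super().__init__("core features", ["doc_key", "num_args"])

    def fill(self, current, rel, target=None):
        # With no target, fill this group; otherwise fill the given vector
        vec = self if target is None else target
        vec["doc_key"] = current["key"]
        vec["num_args"] = len(rel["args"])


class MergedSketch(dict):
    """Hypothetical merged group: the union of its subgroups' keys."""

    def __init__(self, groups):
        super().__init__()
        self.groups = groups
        for group in groups:
            self.update(group)

    def fill(self, current, rel, target=None):
        vec = self if target is None else target
        for group in self.groups:
            # Each subgroup writes into the merged vector, not into itself
            group.fill(current, rel, vec)


# Placeholder inputs, not real PDTB data
current = {"key": "placeholder-doc"}
rel = {"args": ["arg1", "arg2"]}

merged = MergedSketch([CoreSketch()])
merged.fill(current, rel)
print(dict(merged))
```

Keeping key definitions and fill logic in the same class means a new family of features can be added by writing one subgroup and listing it in the merged group.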

class educe.pdtb.util.features.SingleArgKeys(inputs)

Bases: educe.learning.keys.MergedKeyGroup

Features for a single EDU

fill(current, arg, target=None)

See SingleArgSubgroup.fill

class educe.pdtb.util.features.SingleArgSubgroup(description, keys)

Bases: educe.learning.keys.KeyGroup

Abstract keygroup for subgroups of the merged SingleArgKeys. We use these subgroup classes for modularity: the code that defines a set of related feature vector keys should live alongside the code that fills them out.

fill(current, arg, target=None)

Fill out a vector’s features. If the vector (target) is None, we just fill out this group; in the case of a merged key group, you may find it desirable to fill out the merged group instead.

educe.pdtb.util.features.extract_rel_features(inputs)

Return a pair of dictionaries, one for attachments and one for relations

educe.pdtb.util.features.mk_current(inputs, k)

Pre-process and bundle up a representation of the current document

educe.pdtb.util.features.spans_to_str(spans)

String representation of a list of spans, meant to work as an ID