slang.snip_stats

Snip statistics functions

class slang.snip_stats.BayesFactors(pseudocount=0, tag_order=None, alphabet_size=None)

A Bayes factor classifier with an sklearn-like interface. The values returned by predict_proba are the log2 of the Bayes factor. To avoid depending on sklearn, the class is not a subclass of BaseEstimator and ClassifierMixin. It provides only the essentials of a classifier: a fit method, a predict_proba method, a derived predict method, and a classes_ attribute that indexes the columns of the predict_proba matrix.

predict(snips)

Predict a class label for each snip in the snips sequence.
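The idea behind a log2 Bayes factor classifier can be sketched in plain Python. This is a minimal illustration, not the slang implementation: the function names `fit_log2_bayes_factors` and `predict_from_log2_bf` are hypothetical, and the factor is assumed here to compare the probability of a snip under a tag with its probability under all other tags, which may differ from how BayesFactors defines it.

```python
import math
from collections import Counter, defaultdict


def fit_log2_bayes_factors(snips, tags, pseudocount=1):
    """Hypothetical helper: count snips per tag and return (classes, log2_bf).

    log2_bf[snip] is a list with one log2 Bayes factor per class, in the
    order given by `classes` (playing the role of a classes_ attribute).
    Assumption: the factor is P(snip | tag) / P(snip | any other tag).
    """
    count_for_tag = defaultdict(Counter)
    for snip, tag in zip(snips, tags):
        count_for_tag[tag][snip] += 1

    classes = sorted(count_for_tag)
    total = Counter()
    for counts in count_for_tag.values():
        total.update(counts)
    alphabet = sorted(total)
    n_total = sum(total.values())

    log2_bf = {}
    for snip in alphabet:
        row = []
        for tag in classes:
            n_tag = sum(count_for_tag[tag].values())
            # Smoothed probability of the snip inside and outside the tag.
            p_in = (count_for_tag[tag][snip] + pseudocount) / (
                n_tag + pseudocount * len(alphabet)
            )
            p_out = (total[snip] - count_for_tag[tag][snip] + pseudocount) / (
                n_total - n_tag + pseudocount * len(alphabet)
            )
            row.append(math.log2(p_in / p_out))
        log2_bf[snip] = row
    return classes, log2_bf


def predict_from_log2_bf(snips, classes, log2_bf):
    """Pick, for each snip, the class with the largest log2 Bayes factor."""
    return [
        classes[max(range(len(classes)), key=lambda i: log2_bf[snip][i])]
        for snip in snips
    ]


train_snips = [0, 0, 0, 1, 1, 1, 1, 0]
train_tags = ["a", "a", "a", "a", "b", "b", "b", "b"]
classes, log2_bf = fit_log2_bayes_factors(train_snips, train_tags, pseudocount=1)
predicted = predict_from_log2_bf([0, 1, 0], classes, log2_bf)
print(predicted)  # ['a', 'b', 'a']
```

A positive value means the snip is more likely under that class than under the others, and a negative value means the opposite. A nonzero pseudocount keeps the ratio finite for snips never seen under a given tag.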

class slang.snip_stats.ClassifiedMomentsFitter(n_classes_: int, initial_count: int = 1)

class slang.snip_stats.PredictBF(pseudocount=0, tag_order=None, alphabet_size=None)

class slang.snip_stats.PredictProbaBF(pseudocount=0, tag_order=None, alphabet_size=None)

class slang.snip_stats.TagProbsBF(pseudocount=0, tag_order=None, alphabet_size=None)

slang.snip_stats.bar_plot_of_tag_snip_stats(snip_stats_for_tag, snips_to_str=None, figsize=(14, 10), snip_order=None, tag_order=None, output_fig=False, ylabel_fontsize=None, ylabel_rotation=90)

Multiplot of snip count bars, with one row per tag. The first row shows the total count for each snip.

:param snip_stats_for_tag: {tag: {snip: count, …}, …} nested dict
:param snips_to_str: A function that transforms snip lists into strings (mapping each snip to a character)
:param figsize:
:param output_fig:
:param tag_order: Specifies the order of the tags and, if not all tags are wanted, the subset of tags to plot
:param ylabel_rotation: Rotation applied to the ylabel
:return:

slang.snip_stats.df_of_snip_count_for_tag(snip_count_for_tag, snips_to_str=None, fillna=0, tag_order=None)

A dataframe representation of snip_count_for_tag.

:param snip_count_for_tag: {tag: {snip: count, …}, …} dict
:param snips_to_str: A function that transforms snip lists into strings (mapping each snip to a character)
:param fillna: What to fill missing values with
:param tag_order: Specifies the order of the tags and, if not all tags are wanted, the subset of tags to include
:return: A dataframe of snip counts, with snips in rows and tags in columns
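The table layout described above (snips in rows, tags in columns, missing counts filled in) can be illustrated without pandas. This is only a sketch of the layout, assuming the nested-dict input format documented here. The helper name `snip_rows_by_tag_columns` is hypothetical and is not part of slang.

```python
def snip_rows_by_tag_columns(snip_count_for_tag, fillna=0, tag_order=None):
    """Hypothetical helper: return {snip: {tag: count}} with gaps filled.

    tag_order both orders the tag columns and restricts them to a subset.
    """
    tags = list(tag_order) if tag_order is not None else list(snip_count_for_tag)
    # Every snip seen under at least one of the selected tags gets a row.
    snips = sorted({snip for tag in tags for snip in snip_count_for_tag[tag]})
    return {
        snip: {tag: snip_count_for_tag[tag].get(snip, fillna) for tag in tags}
        for snip in snips
    }


counts = {"dog": {0: 3, 1: 1}, "cat": {1: 2, 2: 5}}
table = snip_rows_by_tag_columns(counts, tag_order=["cat", "dog"])
for snip, row in table.items():
    print(snip, row)
```

Snip 0 never occurs under "cat", so that cell takes the fillna value, and the columns follow tag_order rather than the insertion order of the input dict.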

slang.snip_stats.plot_snip_count_for_tag(snip_count_for_tag, snips_to_str=None, figsize=(14, 10), tag_order=None, output_fig=False, ylabel_fontsize=None, ylabel_rotation=90)

Multiplot of snip count bars, with one row per tag. The first row shows the total count for each snip.

:param snip_count_for_tag: {tag: {snip: count, …}, …} nested dict
:param snips_to_str: A function that transforms snip lists into strings (mapping each snip to a character)
:param figsize:
:param output_fig:
:param tag_order: Specifies the order of the tags and, if not all tags are wanted, the subset of tags to plot
:param ylabel_rotation: Rotation applied to the ylabel
:return:
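The data behind such a multiplot (a totals row followed by one row per tag) could be assembled as in the sketch below. The helper name `rows_to_plot` and the "total" label are hypothetical, and the totals are assumed to be summed over the selected tags only, which the documentation does not specify.

```python
from collections import Counter


def rows_to_plot(snip_count_for_tag, tag_order=None):
    """Hypothetical helper: return [(label, {snip: count}), ...] for plotting.

    The first entry holds the total count per snip (assumption: summed over
    the selected tags), followed by one entry per tag in tag_order.
    """
    tags = list(tag_order) if tag_order is not None else list(snip_count_for_tag)
    total = Counter()
    for tag in tags:
        total.update(snip_count_for_tag[tag])
    return [("total", dict(total))] + [
        (tag, dict(snip_count_for_tag[tag])) for tag in tags
    ]


counts = {"dog": {0: 3, 1: 1}, "cat": {1: 2, 2: 5}}
rows = rows_to_plot(counts, tag_order=["cat", "dog"])
for label, snip_counts in rows:
    print(label, snip_counts)
```

Each entry corresponds to one row of bars, so a plotting loop would draw one bar chart per entry on its own axis.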

slang.snip_stats.tag_slice_iter_from_slices_of_tag_dict(slices_of_tag)

Get an iterator of (tag, (bt, tt)) pairs.

:param slices_of_tag: a {tag: [(bt, tt), …], …} dict listing slices annotated by tags
:return: an iterator of (tag, (bt, tt)) pairs
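The flattening this function describes can be sketched as a small generator. The name `tag_slice_pairs` is hypothetical, and the iteration order (tags in dict order, slices in list order) is an assumption rather than documented behavior.

```python
def tag_slice_pairs(slices_of_tag):
    """Hypothetical sketch: yield (tag, (bt, tt)) for every annotated slice."""
    for tag, slices in slices_of_tag.items():
        for bt, tt in slices:
            yield tag, (bt, tt)


slices_of_tag = {"a": [(0, 10), (20, 30)], "b": [(5, 15)]}
pairs = list(tag_slice_pairs(slices_of_tag))
print(pairs)  # [('a', (0, 10)), ('a', (20, 30)), ('b', (5, 15))]
```

Because the result is lazy, a large annotation dict can be streamed through without first building the full list of pairs.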