arche.data_quality_report module
-
class arche.data_quality_report.DataQualityReport(items: arche.readers.items.Items, schema: Dict[str, Dict[str, Union[str, bool, int, float, None, List[T]]]], report: arche.report.Report, bucket: Optional[str] = None)
Bases: object
-
coverage_by_categories(df, tagged_fields)
Builds tables showing the number of items per category, for fields set up with a category tag.
Parameters
df – a dataframe of items
tagged_fields – a dict of tags
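The counting idea behind coverage_by_categories can be sketched in plain Python. This is an illustration only, not arche's implementation: the helper name coverage_by_categories_sketch is hypothetical, a list of dicts stands in for the dataframe rows, and the shape of tagged_fields (a tag name mapped to a list of field names, with "category" as the tag) is an assumption.

```python
from collections import Counter


def coverage_by_categories_sketch(rows, tagged_fields):
    """Count items per category value for each field under the "category" tag.

    Hypothetical helper: `rows` is a list of item dicts standing in for the
    dataframe, and `tagged_fields` is assumed to map tag names to field lists.
    """
    tables = {}
    for field in tagged_fields.get("category", []):
        # One table per category field: value -> number of items with that value.
        counts = Counter(row.get(field) for row in rows)
        tables[field] = dict(counts.most_common())
    return tables


# Made-up sample items for illustration.
rows = [
    {"name": "A", "color": "red", "size": "S"},
    {"name": "B", "color": "red", "size": "M"},
    {"name": "C", "color": "blue", "size": "M"},
]
tagged_fields = {"category": ["color", "size"], "name_field": ["name"]}

tables = coverage_by_categories_sketch(rows, tagged_fields)
```

Here tables holds one count table per category field; fields listed under other tags, such as name, are ignored.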
-
create_appendix(schema)
-
create_figures(items, items_dicts)
-
drop_service_columns(df)
-
job_summary_table(job)
-
plot_html_to_stream()
-
plot_to_notebook()
-
rules_summary_table(df, no_of_validation_warnings, name_field, url_field, no_of_checked_duplicated_items, no_of_duplicated_items, unique, no_of_checked_skus, no_of_duplicated_skus, price_field, price_was_field, no_of_checked_price_items, no_of_price_warns, **kwargs)
-
save_report_to_bucket(project_id, spider, bucket)
-
score_table(quality_estimation, field_accuracy)
-
scraped_fields_coverage(job, df)
-
scraped_items_history(job_no, job_numbers, date_items)
-