Compare jobs

[1]:
%cd ../../../src
/Users/valery/Documents/_code/arche/src
[2]:
from arche import *
[3]:
a = Arche(source="381798/1/2", target="381798/1/1")

Let’s use the schema from Basics

[4]:
a.schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "definitions": {
        "float": {
            "pattern": "^-?[0-9]+\\.[0-9]{2}$"
        },
        "url": {
            "pattern": "^https?://(www\\.)?[a-z0-9.-]*\\.[a-z]{2,}([^<>%\\x20\\x00-\\x1f\\x7F]|%[0-9a-fA-F]{2})*$"
        }
    },
    "additionalProperties": False,
    "type": "object",
    "properties": {
        "category": {"type": "string", "tag": ["category"]},
        "price": {"type": "string", "pattern": "^£\\d{2}\\.\\d{2}$"},
        "_type": {"type": "string"},
        "description": {"type": "string"},
        "title": {"type": "string", "tag": ["unique"]},
        "_key": {"type": "string"}
    },
    "required": [
        "_key",
        "_type",
        "category",
        "description",
        "price",
        "title"
    ]
}
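
The `price` pattern can be tried in isolation with Python's `re` module before running the full report. The following is a minimal sketch, not part of Arche: it uses a strict form of the pattern (raw string, dot escaped so it matches only a literal `.`), and the sample values are hypothetical, not taken from the compared jobs.

```python
import re

# Strict form of the "price" pattern: raw string, literal dot.
PRICE_PATTERN = re.compile(r"^£\d{2}\.\d{2}$")

# Hypothetical sample values, made up for illustration.
samples = ["£51.77", "£5.00", "£51,77", "51.77"]

# JSON Schema "pattern" uses search semantics, so re.search is used here.
results = {value: bool(PRICE_PATTERN.search(value)) for value in samples}

for value, ok in results.items():
    print(f"{value!r}: {'matches' if ok else 'does not match'}")
```

Only values with a pound sign, exactly two digits, a dot, and two more digits pass. With an unescaped `.`, a value such as `£51,77` would pass as well.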
[6]:
a.report_all()


Job Outcome:
        Finished

Job Errors:
        No errors

Responses Per Item Ratio:
        Number of responses / Number of scraped items - 1.05

Total Scraped Items:
        Same number of items

Compare Runtime:
        Similar or better runtime - 0:00:49.589000 and 0:00:55.089000

Finish Time:
        Less than 1 day difference

Fields Coverage:
        PASSED

Boolean Fields:
        SKIPPED

JSON Schema Validation:
        1000 items were checked, 1 error(s)

Tags:
        Used - category, unique
        Not used - name_field, product_price_field, product_price_was_field, product_url_field

Compare Price Was And Now:
        product_price_field or product_price_was_field tags were not found in schema

Uniqueness:
        'title' contains 1 duplicated value(s)

Duplicated Items:
        'name_field' and 'product_url_field' tags were not found in schema

Coverage For Scraped Categories:
        50 categories in 'category'

Compare Prices For Same Urls:
        product_url_field tag is not set

Compare Names Per Url:
        product_url_field tag is not set

Compare Prices For Same Names:
        name_field tag is not set




Coverage Difference (1 message(s)):

Fields Coverage (1 message(s)):

JSON Schema Validation (1 message(s)):
2 items affected - description is not of type 'string': 982 459


Uniqueness (1 message(s)):
2 items affected - same 'The Star-Touched Queen' title: 221 340
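
The uniqueness message above lists items that share a value in a field tagged `unique`. A rough sketch of that kind of check with `collections.Counter` follows; it is not Arche's implementation, and both the `find_duplicates` helper and the sample items are hypothetical.

```python
from collections import Counter


def find_duplicates(items, field):
    """Return {value: [item indices]} for values of `field` seen more than once."""
    counts = Counter(item[field] for item in items)
    duplicates = {}
    for index, item in enumerate(items):
        value = item[field]
        if counts[value] > 1:
            duplicates.setdefault(value, []).append(index)
    return duplicates


# Hypothetical items, made up for illustration.
items = [
    {"title": "The Star-Touched Queen"},
    {"title": "A Light in the Attic"},
    {"title": "The Star-Touched Queen"},
]

duplicates = find_duplicates(items, "title")
print(duplicates)
```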


Coverage For Scraped Categories (1 message(s)):

Category Coverage Difference (1 message(s)):