API

[1]:
%cd ../../../src
/Users/valery/Documents/_code/arche/src

Getting data

[2]:
from arche.readers.items import JobItems
[3]:
job_items = JobItems(key="381798/1/1")
[4]:
job_items.df.head()

[4]:
|   | _key | _type | category | description | price | title |
|---|------|-------|----------|-------------|-------|-------|
| 0 | https://app.scrapinghub.com/p/381798/1/1/item/0 | dict | Travel | “Wherever you go, whatever you do, just . . . ... | £45.17 | It's Only the Himalayas |
| 1 | https://app.scrapinghub.com/p/381798/1/1/item/1 | dict | Politics | Libertarianism isn't about winning elections; ... | £51.33 | Libertarianism for Beginners |
| 2 | https://app.scrapinghub.com/p/381798/1/1/item/2 | dict | Science Fiction | Andrew Barger, award-winning author and engine... | £37.59 | Mesaerion: The Best Science Fiction Stories 18... |
| 3 | https://app.scrapinghub.com/p/381798/1/1/item/3 | dict | Poetry | Part fact, part fiction, Tyehimba Jess's much ... | £23.88 | Olio |
| 4 | https://app.scrapinghub.com/p/381798/1/1/item/4 | dict | Music | This is the never-before-told story of the mus... | £57.25 | Our Band Could Be Your Life: Scenes from the A... |

Running rules

[5]:
from arche.rules.duplicates import find_by
[6]:
find_by(job_items.df, ["title", "category"]).show()

Items Uniqueness By Columns:
        2 duplicate(s) with same title, category
2 items affected - same 'The Star-Touched Queen' title 'Fantasy' category: 221 341
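
The report above flags items that share the same combination of `title` and `category`. As a rough illustration of that idea, and not arche's actual implementation, the same kind of check can be sketched in plain pandas with `DataFrame.duplicated`. The sample data and the `duplicated_rows` helper below are hypothetical and exist only for this example.

```python
import pandas as pd

# Hypothetical sample data that mirrors the two columns checked above.
df = pd.DataFrame(
    {
        "title": ["The Star-Touched Queen", "Olio", "The Star-Touched Queen", "Olio"],
        "category": ["Fantasy", "Poetry", "Fantasy", "Music"],
    }
)


def duplicated_rows(frame, columns):
    """Return every row whose values in `columns` occur more than once."""
    # keep=False marks all members of a duplicate group, not only the repeats.
    mask = frame.duplicated(subset=columns, keep=False)
    return frame[mask]


dupes = duplicated_rows(df, ["title", "category"])
print(dupes.index.tolist())  # [0, 2]
```

The two "Olio" rows are not reported because their categories differ: a row counts as a duplicate only when all listed columns match.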
