We’re always happy to answer any other questions you might have. Send us an email at [email protected]
Enrich Webz Web Archives with Tisane Entity Extraction
Top companies trust Datastreamer to integrate, enrich, join, and apply their web data.
About Webz Web Archives
Historical combined datasets from across the web.
About Tisane Entity Extraction
Detect mentions of people, organizations, locations, filenames, phone numbers, crypto addresses, and more.
Entities are elements of relevance or interest in the text. Tisane extracts both standard entities and those relevant to trust & safety/law enforcement applications.
Standard entities are names of people, their social roles, organizations, places, and so on. We also extract cryptocurrency addresses, bank accounts, credit card numbers, phone numbers, software package names, and more.
Every entity entry is an object made of:
type - the type of the entity
name - a standard name, if one exists; otherwise, the string that was logged
subtypes - more detailed additional types
subtype - the first subtype (for backward compatibility purposes)
mentions - an array of all detected mentions, with:
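As a rough illustration of the entity object described above, here is a minimal Python sketch that reads one entity entry into a typed structure. The `Entity` class, the `parse_entity` helper, and the sample values are hypothetical and exist only for this example. It assumes `type` arrives as a single string, and it treats each mention as an opaque value because the mention fields are not listed here.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Entity:
    """Hypothetical container mirroring the entity fields listed above."""
    type: str                                           # the type of the entity (assumed to be a string)
    name: str                                           # standard name, or the string that was logged
    subtypes: list[str] = field(default_factory=list)   # more detailed additional types
    subtype: Optional[str] = None                       # first subtype, kept for backward compatibility
    mentions: list[Any] = field(default_factory=list)   # detected mentions, left opaque in this sketch


def parse_entity(raw: dict[str, Any]) -> Entity:
    """Build an Entity from one raw entity entry (a plain dict)."""
    subtypes = list(raw.get("subtypes", []))
    # If "subtype" is absent, fall back to the first subtype, matching its description.
    subtype = raw.get("subtype", subtypes[0] if subtypes else None)
    return Entity(
        type=raw["type"],
        name=raw["name"],
        subtypes=subtypes,
        subtype=subtype,
        mentions=list(raw.get("mentions", [])),
    )


# Made-up sample entry; the values are illustrative, not real API output.
sample = {
    "type": "person",
    "name": "Jane Doe",
    "subtypes": ["example_subtype"],
    "mentions": [],
}

entity = parse_entity(sample)
print(entity.type, entity.name, entity.subtype)
```

Keeping `subtype` alongside `subtypes` lets older consumers that expect a single value keep working, while newer code can read the full list.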
Quickly enrich Webz Web Archives with Tisane Entity Extraction using a Datastreamer Pipeline.
Step 1
Start your Pipeline with Webz Web Archives
Web data serves as the foundational input for any data pipeline. Pipelines can be powered by diverse data sources, including datasets from our partner ecosystem, proprietary internal systems, or any externally accessible web data.
Step 2
Add Tisane Entity Extraction to enrich
Transform your web data at scale with Datastreamer. Whether you're enriching, storing, joining, or filtering, you'll find hundreds of ready-made operations to help you move fast.
Step 3
That's it! You have just connected Webz Web Archives and Tisane Entity Extraction.
Datastreamer transforms how you use web data. Grow your Pipelines without disruption and finally streamline the operational side of your workflow.