How to use Datasets
What is a Dataset?
Datasets are like normal databases (e.g., a spreadsheet, CSV, Airtable, etc) that you can use alongside data that you’ve extracted in Impira. Use them to reconcile, connect, and unlock data stored in outside locations to your data in Impira.
You can create rows, select field types, and enter in any data you want.
What can I do with Datasets?
You can import a database as a CSV file into a Dataset and join a field in that database with the same field in an existing Collection. Continue reading about this use case below.
Search and query custom data
Once you either create a Dataset with manually inputted data or via CSV upload, you can now search this data by filtering and use IQL queries to shape that data.
How to create a Dataset from scratch
- Find Datasets in the left sidebar.
- Click the + and name your new Dataset.
Step 2: Adding fields
- Add fields (i.e., columns) by clicking the + sign at the top of the Dataset.
- Enter a field name, then choose your field type and value type. Click Create.
- Once you finish adding all your column headers, create as many rows as you need by selecting + Create Row at the top of the dataset.
- Click on any cell to add or edit values.
How to open CSV files as a Dataset
- Upload your CSV file to All files.
- After it finishes processing, double-click the file to open it.
- Select Open file as Dataset.
You can now filter, search, and edit your data, as well as add additional fields and rows. You can also create a join field to link your new Dataset to other Datasets, Collections, or Views.
Example use case: Joining data
You work for a company called Benson’s Boutique Goods that receives thousands of invoices per month from various vendors that contain data fields like Company name, Date, Vendor ID, and Total.
Benson’s keeps a spreadsheet called Vendor Database that also contains the Vendor ID field, but it also has a lot of additional information about each vendor not available on an invoice that the accounting department needs.
In order for the accounting department to process invoices they need all the vendor data available in the Vendor Database. Instead of manually searching for the appropriate Vendor ID, they need a way to automatically look up vendor data that’s tied to a specific invoice and have all that data sent to them in a consolidated manner.
Both the invoices and Vendor Database contain the field, Vendor ID, so you can use that field as a bridge between the two. The bridge is formed by creating a join field between the extracted Vendor ID from the invoices and the existing Vendor ID field in the Vendor Database.
This returns all associated vendor information from the Vendor Database for each corresponding invoice as a consolidated join field.
How to set up this workflow
Upload Vendor Database as a CSV file and create a Dataset. (See the section How to open CSV files as a Dataset.)
Create a join field to link the Vendor ID field of your new Dataset to the Vendor ID field from your existing Collection of invoices.
- Go to the Vendor Invoices Collection and open any file. Click New field and hover over the three dots to see the Advanced menu.
- Name your new join field — e.g., “Vendor join”.
- For “Field on Vendor invoices” choose Vendor ID.
- For “Second Collection, Dataset, or View” choose Vendor Database.csv.
- For “Field on second Collection, Dataset, or View” choose Vendor ID.
- Hit Create field.
You can see in the screenshot below that your join field in the Vendor Invoices Collection has pulled in all the associated data from the Vendor Database Dataset, using the Vendor ID field to connect the two: